Preference-based online learning with dueling bandits: A survey

Cited by: 0

Authors:

Bengs, Viktor [1]

Busa-Fekete, Robert [2]

Mesaoudi-Paul, Adil El [1]

Hüllermeier, Eyke [1]

Affiliations:

[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany

[2] Google Research, New York, NY, United States

Source:

Journal of Machine Learning Research | 2021 / Vol. 22

Citations

50 entries in total

[21] Active Preference-Based Learning of Reward Functions [J].

Sadigh, Dorsa; Dragan, Anca D.; Sastry, Shankar; Seshia, Sanjit A.

ROBOTICS: SCIENCE AND SYSTEMS XIII, 2017

[22] Learning solution similarity in preference-based CBR [J].

Abdel-Aziz, Amira; Strickert, Marc; Hüllermeier, Eyke.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8765: 17-31

[23] Versatile Dueling Bandits: Best-of-both World Analyses for Online Learning from Relative Preferences [J].

Saha, Aadirupa; Gaillard, Pierre.

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022: 19011-19026

[24] Inverse Preference Learning: Preference-based RL without a Reward Function [J].

Hejna, Joey; Sadigh, Dorsa.

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023

[25] Online Certification of Preference-Based Fairness for Personalized Recommender Systems [J].

Do, Virginie; Corbett-Davies, Sam; Atif, Jamal; Usunier, Nicolas.

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 6532-6540

[26] Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach [J].

Szorenyi, Balazs; Busa-Fekete, Robert; Paul, Adil; Huellermeier, Eyke.

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28

[27] A Generalized Acquisition Function for Preference-based Reward Learning [J].

Ellis, Evan; Ghosal, Gaurav R.; Russell, Stuart J.; Dragan, Anca; Biyik, Erdem.

2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024: 2814-2821

[28] Model-Free Preference-Based Reinforcement Learning [J].

Wirth, Christian; Fuernkranz, Johannes; Neumann, Gerhard.

THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016: 2222-2228

[29] Embedding Learning for Preference-based Speech Quality Assessment [J].

Hu, Cheng-Hung; Yasuda, Yusuke; Toda, Tomoki.

INTERSPEECH 2024, 2024: 2685-2689

[30] Learning to Identify Top Elo Ratings: A Dueling Bandits Approach [J].

Yan, Xue; Du, Yali; Ru, Binxin; Wang, Jun; Zhang, Haifeng; Chen, Xu.

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 8797-8805
