Preference-based online learning with dueling bandits: A survey

Cited by: 0
Authors
Bengs, Viktor [1]
Busa-Fekete, Robert [2]
Mesaoudi-Paul, Adil El [1]
Hüllermeier, Eyke [1]
Affiliations
[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany
[2] Google Research, New York, NY, United States
DOI: not available
Related papers
50 in total
[31]   Preference-based Teaching [J].
Gao, Ziyuan ;
Ries, Christoph ;
Simon, Hans U. ;
Zilles, Sandra .
JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18 :1-32
[32]   Preference-based unawareness [J].
Schipper, Burkhard C. .
MATHEMATICAL SOCIAL SCIENCES, 2014, 70 :34-41
[33]   APReL: A Library for Active Preference-based Reward Learning Algorithms [J].
Biyik, Erdem ;
Talati, Aditi ;
Sadigh, Dorsa .
PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), 2022, :613-617
[34]   Preference-Based Assistance Map Learning With Robust Adaptive Oscillators [J].
Li, Shilei ;
Zou, Wulin ;
Duan, Pu ;
Shi, Ling .
IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (04) :1000-1009
[35]   Preference-based decision making for personalised access to Learning Resources [J].
Department of Special Education, University of Thessaly, Argonafton and Filellinon Street, Volos, GR 38221, Greece ;
Unknown ;
Unknown .
Int. J. Auton. Adapt. Commun. Syst., 2008, 3 (356-369) :356-369
[36]   A Policy Iteration Algorithm for Learning from Preference-Based Feedback [J].
Wirth, Christian ;
Fürnkranz, Johannes .
ADVANCES IN INTELLIGENT DATA ANALYSIS XII, 2013, 8207 :427-437
[37]   Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation [J].
Ren, Zhizhou ;
Liu, Anji ;
Liang, Yitao ;
Peng, Jian ;
Ma, Jianzhu .
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
[38]   Active Preference-Based Gaussian Process Regression for Reward Learning [J].
Biyik, Erdem ;
Huynh, Nicolas ;
Kochenderfer, Mykel J. ;
Sadigh, Dorsa .
ROBOTICS: SCIENCE AND SYSTEMS XVI, 2020
[39]   RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences [J].
Cheng, Jie ;
Xiong, Gang ;
Dai, Xingyuan ;
Miao, Qinghai ;
Lv, Yisheng ;
Wang, Fei-Yue .
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2024, 235
[40]   Listwise Reward Estimation for Offline Preference-based Reinforcement Learning [J].
Choi, Heewoong ;
Jung, Sangwon ;
Ahn, Hongjoon ;
Moon, Taesup .
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2024, 235