Preference-based online learning with dueling bandits: A survey

Cited by: 0

Authors:

Bengs, Viktor ^{[1]}

Busa-Fekete, Robert ^{[2]}

Mesaoudi-Paul, Adil El ^{[1]}

Hullermeier, Eyke ^{[1]}

Affiliations:

[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany

[2] Google Research, New York, NY, United States

Source:

Journal of Machine Learning Research | 2021 / Volume 22

Keywords:

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Citations

50 items in total

[31] Preference-based Teaching [J].

Gao, Ziyuan ;

Ries, Christoph ;

Simon, Hans U. ;

Zilles, Sandra .

JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18 :1-32

[32] Preference-based unawareness [J].

Schipper, Burkhard C. .

MATHEMATICAL SOCIAL SCIENCES, 2014, 70 :34-41

[33] APReL: A Library for Active Preference-based Reward Learning Algorithms [J].

Biyik, Erdem ;

Talati, Aditi ;

Sadigh, Dorsa .

PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), 2022, :613-617

[34] Preference-Based Assistance Map Learning With Robust Adaptive Oscillators [J].

Li, Shilei ;

Zou, Wulin ;

Duan, Pu ;

Shi, Ling .

IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (04) :1000-1009

[35] Preference-based decision making for personalised access to Learning Resources [J].

Department of Special Education, University of Thessaly, Argonafton and Filellinon Street, Volos, GR 38221, Greece ;

Unknown ;

Unknown .

Int. J. Auton. Adapt. Commun. Syst., 2008, 3 :356-369

[36] A Policy Iteration Algorithm for Learning from Preference-Based Feedback [J].

Wirth, Christian ;

Furnkranz, Johannes .

ADVANCES IN INTELLIGENT DATA ANALYSIS XII, 2013, 8207 :427-437

[37] Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation [J].

Ren, Zhizhou ;

Liu, Anji ;

Liang, Yitao ;

Peng, Jian ;

Ma, Jianzhu .

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,

[38] Active Preference-Based Gaussian Process Regression for Reward Learning [J].

Biyik, Erdem ;

Huynh, Nicolas ;

Kochenderfer, Mykel J. ;

Sadigh, Dorsa .

ROBOTICS: SCIENCE AND SYSTEMS XVI, 2020,

[39] RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences [J].

Cheng, Jie ;

Xiong, Gang ;

Dai, Xingyuan ;

Miao, Qinghai ;

Lv, Yisheng ;

Wang, Fei-Yue .

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2024, 235

[40] Listwise Reward Estimation for Offline Preference-based Reinforcement Learning [J].

Choi, Heewoong ;

Jung, Sangwon ;

Ahn, Hongjoon ;

Moon, Taesup .

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 2024, 235
