DISTRIBUTIONAL PREFERENCE LEARNING: UNDERSTANDING AND ACCOUNTING FOR HIDDEN CONTEXT IN RLHF

Cited by: 0
Authors
Siththaranjan, Anand [1]
Laidlaw, Cassidy [1]
Hadfield-Menell, Dylan [2]
Affiliations
[1] University of California, Berkeley, United States
[2] Massachusetts Institute of Technology, United States
Source
12th International Conference on Learning Representations, ICLR 2024 | 2024
Keywords
Compendex;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Reinforcement learning