A Q-Learning Approach for Adherence-Aware Recommendations

Cited by: 2
Authors
Faros, Ioannis [1]
Dave, Aditya [2]
Malikopoulos, Andreas A. [1,3]
Affiliations
[1] Cornell Univ, Syst Engn, Ithaca, NY 14850 USA
[2] Cornell Univ, Sch Civil & Environm Engn, Ithaca, NY 14850 USA
[3] Cornell Univ, Sch Civil & Environm Engn, Ithaca, NY 14850 USA
Source
IEEE CONTROL SYSTEMS LETTERS | 2023 / Vol. 7
Keywords
Q-learning; Markov decision processes; recommender systems; reinforcement learning; ALGORITHMS;
DOI
10.1109/LCSYS.2023.3339591
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In many real-world scenarios involving high-stakes and safety implications, a human decision-maker (HDM) may receive recommendations from an artificial intelligence while holding the ultimate responsibility of making decisions. In this letter, we develop an "adherence-aware Q-learning" algorithm to address this problem. The algorithm learns the "adherence level" that captures the frequency with which an HDM follows the recommended actions and derives the best recommendation law in real time. We prove the convergence of the proposed Q-learning algorithm to the optimal value and evaluate its performance across various scenarios.
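As a rough illustration of the "adherence level" idea in the abstract, the sketch below keeps a running estimate of how often a decision-maker's chosen action matches the recommended one. It is not the algorithm from the letter: the class name `AdherenceEstimator`, its methods, and the plain empirical-frequency estimate are assumptions made for illustration only.

```python
class AdherenceEstimator:
    """Hypothetical helper: running frequency with which recommendations are followed."""

    def __init__(self):
        self.followed = 0  # steps where the taken action matched the recommendation
        self.total = 0     # all observed steps

    def update(self, recommended, taken):
        """Record one step: the recommended action and the action actually taken."""
        self.total += 1
        if recommended == taken:
            self.followed += 1

    @property
    def level(self):
        """Empirical adherence level in [0, 1]; 0.0 before any observation."""
        return self.followed / self.total if self.total else 0.0


if __name__ == "__main__":
    est = AdherenceEstimator()
    # Made-up (recommended, taken) pairs, purely for demonstration.
    for rec, act in [(0, 0), (1, 0), (1, 1), (2, 2)]:
        est.update(rec, act)
    print(est.level)
```

Counting matches can overstate adherence when the decision-maker would have picked the recommended action anyway, so an estimate like this is best read as an upper-leaning proxy under the stated assumptions.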
Pages: 3645-3650
Number of pages: 6