An Approach for Training Moral Agents via Reinforcement Learning

Cited by: 0
Authors
Gu T. [1, 2]
Gao H. [2]
Li L. [1, 2]
Bao X. [2]
Li Y. [2]
Affiliations
[1] College of Cyber Security, Jinan University, Guangzhou
[2] Guangxi Key Laboratory of Trusted Software (Guilin University of Electronic Technology), Guilin
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2022, Vol. 59, No. 09
Funding
National Natural Science Foundation of China
Keywords
Crowdsourcing; Ethical grading; Ethically aligned design; Moral agent; Reinforcement learning;
DOI
10.7544/issn1000-1239.20210474
Abstract
Artificial agents such as autonomous vehicles and healthcare robots play an increasingly important role in human life, and the moral issues they raise have attracted growing concern. To enable agents to comply with basic human ethical norms, a novel approach for training artificial moral agents is proposed, based on crowdsourcing and reinforcement learning. First, crowdsourcing is used to collect sample data sets of human behaviors, and text clustering and association analysis are used to generate plot graphs and trajectory trees, which define the basic behavior space of an agent and represent the order in which behaviors occur. Second, the concept of meta-ethical behavior is proposed, which expands the agent's behavior space by summarizing similar behaviors across different scenarios; nine kinds of meta-ethical behaviors are extracted from the Code of Daily Behavior of Middle School Students. Finally, a behavior grading mechanism and a corresponding reward and punishment function for reinforcement learning are proposed. In simulated drug purchase scenarios drawn from human life, the Q-learning algorithm and the DQN (deep Q-network) algorithm are each used to train moral agents. Experimental results show that the trained agents complete the expected tasks in an ethical manner, which verifies the rationality and effectiveness of the proposed method. © 2022, Science Press. All rights reserved.
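The idea of combining a behavior grade with a reward and punishment function can be illustrated with a minimal tabular Q-learning sketch. Everything in it is an assumption made for illustration: the two-step drug purchase scenario, the state and action names, the numeric grades, and the `ethics_weight` parameter are hypothetical and are not taken from the paper, whose actual grading scheme and reward function are not given in this abstract.

```python
import random

# Hypothetical toy scenario (not from the paper):
# (state, action) -> (next_state, task_reward, ethical_grade)
TRANSITIONS = {
    ("queue", "wait_in_line"): ("counter", -2.0, +1),   # slower, but polite
    ("queue", "cut_in_line"): ("counter", -1.0, -1),    # faster, but rude
    ("counter", "pay"): ("done", 8.0, +1),              # drug obtained, money spent
    ("counter", "steal"): ("done", 10.0, -2),           # drug obtained for free
}
ACTIONS = {
    "queue": ["wait_in_line", "cut_in_line"],
    "counter": ["pay", "steal"],
}


def shaped_reward(task_reward, grade, ethics_weight):
    """Assumed reward form: task reward plus a weighted ethical grade."""
    return task_reward + ethics_weight * grade


def train(ethics_weight, episodes=2000, alpha=0.1, gamma=0.9,
          epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning on the toy scenario."""
    rng = random.Random(seed)
    q = {sa: 0.0 for sa in TRANSITIONS}
    for _ in range(episodes):
        state = "queue"
        while state != "done":
            actions = ACTIONS[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)          # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state, task_r, grade = TRANSITIONS[(state, action)]
            r = shaped_reward(task_r, grade, ethics_weight)
            if next_state == "done":
                best_next = 0.0                       # terminal state
            else:
                best_next = max(q[(next_state, a)] for a in ACTIONS[next_state])
            # Standard Q-learning update.
            q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return {s: max(acts, key=lambda a: q[(s, a)]) for s, acts in ACTIONS.items()}


if __name__ == "__main__":
    print("with ethical grading:   ", greedy_policy(train(ethics_weight=5.0)))
    print("without ethical grading:", greedy_policy(train(ethics_weight=0.0)))
```

In this toy setting, both policies complete the task of obtaining the drug. Only the agent trained with a nonzero `ethics_weight` learns to wait in line and pay, which mirrors, in a very reduced form, the abstract's claim that graded rewards steer the agent toward completing tasks ethically.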
Pages: 2039-2050
Page count: 11