Ethical Alignment Decision Making for Connected Autonomous Vehicle in Traffic Dilemmas via Reinforcement Learning From Human Feedback

Times cited: 0
Authors
Gao, Xin [1]
Luan, Tian [2]
Li, Xueyuan [1]
Liu, Qi [1]
Ma, Zhaoyang [3]
Meng, Xiaoqiang [1]
Li, Zirui [1, 4]
Affiliations
[1] Beijing Inst Technol, Sch Mech Engn, Beijing 100081, Peoples R China
[2] First Automobile Works Grp Corp, Changchun Prod Car Assembly 2, Changchun 130011, Peoples R China
[3] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing 100044, Peoples R China
[4] Tech Univ Dresden, FriedrichList Fac Transport & Traff Sci, Chair Traff Proc Automat, D-01069 Dresden, Germany
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 23
Keywords
Ethics; Decision making; Feature extraction; Autonomous vehicles; Vehicle dynamics; Accidents; Data mining; Coupled ethical module; ethical alignment decision making; multiscale multimodal ethical network; reinforcement learning from human feedback; INTELLIGENT VEHICLES; ROAD; ACCIDENTS; COSTS;
DOI
10.1109/JIOT.2024.3447070
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Since the introduction of the trolley problem, the ethical decision-making conundrum has extended from autonomous vehicles (AVs) to connected AVs (CAVs) and remains a prominent challenge. When confronted with ethical dilemmas, CAVs must align their responses not merely with value-neutral human preferences but also with broader moral and ethical frameworks. Consequently, to ensure that CAVs do not take actions that contravene established human moral principles, ethical considerations must be carefully integrated into their decision-making systems. In this article, we introduce a multiscale multimodal ethical network (M2ENet), which aims to align the AV decision-making system with human ethical feedback in ethical dilemma scenarios. First, we extract morphological and dynamic features from sensory information and signal data, respectively, using a multiscale multimodal representation. Additionally, an ethical policy-based network is devised to enable AVs to comprehend ethical information; it introduces an ethical alignment factor that aligns the feature matrix with human feedback. Furthermore, the accuracy of ethical interaction information is improved through a coupled ethical module informed by human feedback. Finally, the efficacy of the system is demonstrated on three representative ethical dilemmas in traffic scenarios, using both simulation experiments and hardware-in-the-loop testing. The simulation experiments show that the proposed model generates decision-making strategies more closely aligned with human preferences in ethical traffic scenarios. In addition, in the hardware-in-the-loop tests, the average percentage of ethical bias weights decreases by 45.06% after 150 episodes of training.
Pages: 38585-38600
Number of pages: 16