Learning Human Behavior in Shared Control: Adaptive Inverse Differential Game Approach

Cited by: 6
Authors
Wu, Huai-Ning [1 ,2 ]
Wang, Mi [3 ]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Sci & Technol Aircraft Control Lab, Beijing 100191, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Behavioral sciences; Cost function; Automation; Symmetric matrices; Games; Task analysis; Nash equilibrium; Adaptive estimation; concurrent learning (CL); human behavior learning; inverse differential game (IDG); shared control; SYSTEMS; DRIVER; IDENTIFICATION; COLLABORATION;
DOI
10.1109/TCYB.2023.3244559
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
To enhance the collaborative intelligence of a machine, it is important for the machine to understand what behavior a human may adopt when interacting with it to perform a shared-control task. In this study, an online behavior learning method is proposed for continuous-time linear human-in-the-loop shared control systems that uses only the system state data. A two-player nonzero-sum linear quadratic dynamic game paradigm models the control interaction between a human operator and an automation that actively compensates for the human control action. In this game model, the cost function representing the human behavior is assumed to have an unknown weighting matrix, and the goal is to learn the human behavior, that is, to recover this weighting matrix, from the state data alone. Accordingly, a new adaptive inverse differential game (IDG) method is proposed that integrates concurrent learning (CL) and linear matrix inequality (LMI) optimization. First, a CL-based adaptive law and an interactive controller for the automation are developed to estimate the human's feedback gain matrix online; second, an LMI optimization problem is solved to determine the weighting matrix of the human cost function. Finally, simulation results on a cooperative shared-control driver assistance system demonstrate the feasibility of the developed method.
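The record above does not reproduce the paper's equations, but the abstract's second step (recovering the cost weighting from an estimated gain) can be illustrated with a simplified single-player analogue of the inverse problem. If the human's feedback gain K has been estimated online and the input weighting is fixed at R = I, LQ optimality requires K = B'P together with the Riccati equation A'P + PA + Q - K'K = 0; both conditions are linear in the unknowns P and Q, so the search for a consistent weighting matrix Q can be posed as a semidefinite program. The Python/CVXPY sketch below is a minimal illustration under these assumptions (single player, R = I, Riccati residual relaxed to a minimization); it is not the paper's two-player coupled formulation, and the function name inverse_lqr_lmi is ours.

```python
import numpy as np
import cvxpy as cp

def inverse_lqr_lmi(A, B, K, eps=1e-6):
    """Recover a state weighting Q consistent with an observed feedback
    gain K, assuming a single-player LQR problem with R = I.

    Solves: minimize ||A'P + PA + Q - K'K||_F
            subject to B'P = K, P >= eps*I, Q >= 0.
    (Hypothetical simplification of the paper's two-player IDG step.)
    """
    n = A.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    Q = cp.Variable((n, n), symmetric=True)
    # Riccati residual: zero iff K is exactly optimal for (Q, R = I)
    resid = A.T @ P + P @ A + Q - K.T @ K
    constraints = [
        B.T @ P == K,          # gain consistency: K = R^{-1} B'P with R = I
        P >> eps * np.eye(n),  # value matrix positive definite
        Q >> 0,                # weighting matrix positive semidefinite
    ]
    prob = cp.Problem(cp.Minimize(cp.norm(resid, "fro")), constraints)
    prob.solve(solver=cp.SCS)
    return Q.value, P.value

# Example with an illustrative stabilizing gain (values are made up):
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.4142, 0.6818]])  # stands in for a CL-estimated human gain
Q_hat, P_hat = inverse_lqr_lmi(A, B, K)
```

In the paper's nonzero-sum setting, the analogous conditions involve the closed-loop matrix A - B_h K_h - B_a K_a and a coupled stationarity equation for the human player, but these remain linear in the unknown human weighting once both gains are fixed, which is what makes an LMI formulation possible.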
Pages: 3705-3715
Number of Pages: 11