Risk-averse autonomous systems: A brief history and recent developments from the perspective of optimal control

Cited by: 19
Authors
Wang, Yuheng [1 ]
Chapman, Margaret P. [1 ]
Affiliations
[1] Univ Toronto, Edward S Rogers Sr Dept Elect & Comp Engn, 10 Kings Coll Rd, Toronto, ON M5S 3G8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Autonomous systems; Intelligent systems; Risk and safety analysis; Optimal control; Reinforcement learning; MODEL-PREDICTIVE CONTROL; VALUE-AT-RISK; TIME MARKOV-PROCESSES; DISCRETE-TIME; SENSITIVE CONTROL; SAFE EXPLORATION; STATE; OPTIMIZATION; MINIMAX; VERIFICATION;
DOI
10.1016/j.artint.2022.103743
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We present a historical overview of the connections between the analysis of risk and the control of autonomous systems. We offer two main contributions. Our first contribution is to propose three overlapping paradigms to classify the vast body of literature: the worst-case, risk-neutral, and risk-averse paradigms. We argue that the appropriate assessment of an autonomous system's risk depends on the application at hand; in contrast, it is typical to assess risk using an expectation, variance, or probability alone. Our second contribution is to unify the concepts of risk and autonomous systems. We achieve this by connecting approaches for quantifying and optimizing the risk that arises from a system's behavior, drawing on work across academic fields. The survey is highly multidisciplinary: we include research from the communities of reinforcement learning, stochastic and robust control theory, operations research, and formal verification. We describe both model-based and model-free methods, with emphasis on the former. Lastly, we highlight fruitful areas for further research. A key direction is to blend risk-averse model-based and model-free methods to enhance the real-time adaptive capabilities of systems and thereby improve human and environmental welfare. (C) 2022 Elsevier B.V. All rights reserved.
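To make the three paradigms concrete, the following minimal Python sketch (an illustration, not code from the paper) evaluates one sampled cost distribution under each paradigm: the worst-case paradigm scores the largest realized cost, the risk-neutral paradigm scores the expected cost, and the risk-averse paradigm scores a tail statistic such as Conditional Value-at-Risk (CVaR), computed here from the standard definitions of VaR and CVaR. The lognormal cost model and the level alpha = 0.95 are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(seed=0)
# Hypothetical cost samples standing in for a system's random outcome.
costs = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# Worst-case paradigm: plan against the single most damaging realization.
worst_case = costs.max()

# Risk-neutral paradigm: evaluate the expected (average) cost.
risk_neutral = costs.mean()

# Risk-averse paradigm: CVaR at level alpha, i.e., the mean of the
# worst (1 - alpha) fraction of outcomes.
alpha = 0.95
var_alpha = np.quantile(costs, alpha)            # Value-at-Risk (VaR)
cvar_alpha = costs[costs >= var_alpha].mean()    # empirical CVaR estimate

print(f"worst-case cost:   {worst_case:.3f}")
print(f"risk-neutral cost: {risk_neutral:.3f}")
print(f"CVaR at 0.95:      {cvar_alpha:.3f}")

For a heavy-tailed cost distribution such as this one, the three scores differ substantially, which illustrates the survey's point that the appropriate risk assessment depends on the application at hand.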
Pages: 25