Reward functions for learning to control in air traffic flow management

Cited by: 36
Authors
Cruciol, Leonardo L. B. V. [1]
de Arruda, Antonio C., Jr. [1]
Li Weigang [1]
Li, Leihong [2]
Crespo, Antonio M. F.
Affiliations
[1] Univ Brasilia, TransLab, Brasilia, DF, Brazil
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
Air holding problem; Ground holding problem; Reinforcement learning; Reward function; GROUND-HOLDING PROBLEM; MODEL;
DOI
10.1016/j.trc.2013.06.010
Chinese Library Classification number
U [Transportation];
Discipline classification code
08 ; 0823 ;
Abstract
Air Traffic Flow Management (ATFM) is a complex decision-making process involving multiple stakeholders. In this decision loop, a multi-agent system is developed for both simulation and daily operations to support human decisions. Considering human factors in ATFM, Reinforcement Learning (RL) is well suited to acquiring the knowledge and experience of controllers in order to assist them in subsequent control activities. The paper presents recent developments in reinforcement learning and its reward structure for ATFM decision making. Two types of reward functions are proposed for agent-based RL in air traffic management: (1) a reward function considering safety separation and the fairness impact among different commercial entities in the Ground Holding Problem (GHP), and (2) a reward function considering safety separation in the Air Holding Problem (AHP). Real case studies in Brazil are described to show the effectiveness and efficiency of the developed reward functions in the controller decision process of ATFM. (C) 2013 Elsevier Ltd. All rights reserved.
Pages: 141-155
Number of pages: 15
Related papers
34 in total
[1]   LEARNING INDIRECT ACTIONS IN COMPLEX DOMAINS: ACTION SUGGESTIONS FOR AIR TRAFFIC CONTROL [J].
Agogino, Adrian ;
Tumer, Kagan .
ADVANCES IN COMPLEX SYSTEMS, 2009, 12 (4-5) :493-512
[2]  
[Anonymous], 2010, SERIES ARTIFICIAL IN
[3]  
[Anonymous], 1989, THESIS CAMBRIDGE U
[4]  
[Anonymous], 2016, MULTIAGENT SYSTEMS
[5]   Ground Delay Program Planning Under Uncertainty Based on the Ration-by-Distance Principle [J].
Ball, Michael O. ;
Hoffman, Robert ;
Mukherjee, Avijit .
TRANSPORTATION SCIENCE, 2010, 44 (01) :1-14
[6]   A stochastic integer program with dual network structure and its application to the ground-holding problem [J].
Ball, MO ;
Hoffman, R ;
Odoni, AR ;
Rifkin, R .
OPERATIONS RESEARCH, 2003, 51 (01) :167-171
[7]   Lagrangian delay predictive model for sector-based air traffic flow [J].
Bayen, AM ;
Meyer, G ;
Tomlin, CJ .
JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 2005, 28 (05) :1015-1026
[8]  
Bertsimas D., 2009, INFORMS M SAN DIEG O
[9]  
Bianco L, 2001, TRANSPORTATION ANALY, P95
[10]  
Cherkassky V., 1998, LEARNING DATA CONCEP, V1, P60