An Information-Based Learning Approach to Dual Control

Cited by: 24
Authors
Alpcan, Tansu [1]
Shames, Iman [1]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Melbourne, Vic 3010, Australia
Funding
Australian Research Council;
Keywords
Active learning; black-box systems; dual control; information theory; nonlinear systems; MODEL-PREDICTIVE CONTROL; EXTREMUM SEEKING; MARKOV; INFERENCE; WATER;
DOI
10.1109/TNNLS.2015.2392122
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dual control aims to concurrently learn and control an unknown system. However, actively learning the system conflicts directly with any given control objective, because exploration disturbs the system. This paper presents a receding horizon approach to dual control, in which a multiobjective optimization problem is solved repeatedly, subject to constraints representing the system dynamics. A knowledge gain objective, balanced against a standard finite-horizon control objective, is defined to explicitly quantify the information acquired when learning the system dynamics. Measures from information theory, such as entropy-based uncertainty, Fisher information, and relative entropy, are studied and used to quantify the knowledge gained as a result of the control actions. The resulting iterative framework is applied to Markov decision processes and discrete-time nonlinear systems, demonstrating the broad applicability and usefulness of the presented approach in diverse problem settings. The framework is illustrated with multiple numerical examples.
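The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a toy scalar system y = theta * u + w with an unknown gain theta, a Gaussian belief over theta, a one-step horizon, and a grid search over candidate inputs. The function names, the weights `effort_weight` and `info_weight`, and all numerical values are hypothetical choices made for illustration. The knowledge gain term is the entropy reduction of the Gaussian belief, one of the entropy-based measures the abstract mentions.

```python
import math
import random


def information_gain(u, var_theta, noise_var):
    """Entropy reduction (in nats) of a Gaussian belief on theta
    after observing y = theta * u + w, with w ~ N(0, noise_var)."""
    return 0.5 * math.log(1.0 + var_theta * u * u / noise_var)


def expected_control_cost(u, mean_theta, var_theta, noise_var, target, effort_weight):
    """Expected squared tracking error under the current belief, plus an input penalty."""
    tracking = (mean_theta * u - target) ** 2 + var_theta * u * u + noise_var
    return tracking + effort_weight * u * u


def choose_input(mean_theta, var_theta, noise_var, target,
                 effort_weight, info_weight, candidates):
    """Pick the candidate input minimizing control cost minus weighted information gain."""
    def objective(u):
        cost = expected_control_cost(u, mean_theta, var_theta,
                                     noise_var, target, effort_weight)
        return cost - info_weight * information_gain(u, var_theta, noise_var)
    return min(candidates, key=objective)


def update_belief(mean_theta, var_theta, u, y, noise_var):
    """Conjugate Gaussian update of the belief on theta given one observation."""
    innovation_var = var_theta * u * u + noise_var
    gain = var_theta * u / innovation_var
    new_mean = mean_theta + gain * (y - mean_theta * u)
    new_var = var_theta * noise_var / innovation_var
    return new_mean, new_var


def run(true_theta=2.0, steps=30, info_weight=0.5, seed=0):
    """Repeat: optimize the input, apply it, observe, update the belief."""
    rng = random.Random(seed)
    mean, var = 0.0, 4.0            # assumed prior belief on theta
    noise_var, target, effort_weight = 0.1, 1.0, 0.01
    candidates = [i / 20.0 for i in range(-40, 41)]  # input grid on [-2, 2]
    history = []
    for _ in range(steps):
        u = choose_input(mean, var, noise_var, target,
                         effort_weight, info_weight, candidates)
        y = true_theta * u + rng.gauss(0.0, math.sqrt(noise_var))
        mean, var = update_belief(mean, var, u, y, noise_var)
        history.append((u, y, mean, var))
    return history
```

Under these assumptions, calling `run(info_weight=0.0)` keeps the input at zero, so the belief never improves, whereas a positive `info_weight` makes the controller probe the system and shrink the variance of its belief. This is the conflict between learning and control that the knowledge gain objective is meant to balance.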
Pages: 2736-2748
Page count: 13