Similarity-based transfer learning of decision policies

Cited by: 0

Authors:

Zugarova, Eliska ^{[1]}

Guy, Tatiana V. ^{[1]}

Affiliations:

[1] Czech Acad Sci, Inst Informat Theory & Automat, Dept Adapt Syst, Prague, Czech Republic

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2020

Keywords:

probabilistic model; transfer learning; closed-loop behavior; fully probabilistic design; Bayesian estimation; sequential decision making;

DOI:

10.1109/smc42975.2020.9283093

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

We consider the problem of learning a decision policy from available past experience. Using the Fully Probabilistic Design (FPD) formalism, we propose a new general approach for finding a stochastic policy from past data. The proposed approach assigns a degree of similarity to each past closed-loop behavior. The degree of similarity expresses how close the current decision-making task is to a past task. Bayesian estimation then uses it to learn an approximately optimal policy that incorporates the best past experience. The approach learns the decision policy directly from the data, without interacting with any supervisor or expert and without using any reinforcement signal. The past experience may concern a decision objective different from the current one. Moreover, the past decision policy need not be optimal with respect to the past objective. We demonstrate our approach on simulated examples and show that the learned policy achieves better performance than the optimal FPD policy whenever mismodeling is present.

Pages: 37-44

Number of pages: 8

17 references in total

[1] Abbeel P., 2004, Apprenticeship learning via inverse reinforcement learning, P1, DOI 10.1145/1015330.1015430
[2] Aho AV, 1974, The Design and Analysis of Computer Algorithms
[3] Chang KW, 2015, PR MACH LEARN RES, V37, P2058
[4] Ferguson, TS. Prior distributions on spaces of probability measures [J]. ANNALS OF STATISTICS, 1974, 2 (04): 615-629
[5] Hummersone C, 2016, ALTERNATIVE BOX PLOT
[6] Kárny, M; Guy, TV. Fully probabilistic control design [J]. SYSTEMS & CONTROL LETTERS, 2006, 55 (04): 259-265
[7] Karny, M. Towards fully probabilistic control design [J]. AUTOMATICA, 1996, 32 (12): 1719-1722
[8] KARNY M., 2006, ADV INFO KNOW PROC
[9] Karny, Miroslav. Fully probabilistic design unifies and supports dynamic decision making under uncertainty [J]. INFORMATION SCIENCES, 2020, 509: 104-118
[10] Karny, Miroslav; Macek, Karel; Guy, Tatiana V. Lazy Fully Probabilistic Design of Decision Strategies [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2014, 2014, 8866: 140-149
