Similarity-based transfer learning of decision policies

Cited by: 0
Authors
Zugarova, Eliska [1]
Guy, Tatiana V. [1]
Affiliations
[1] Czech Acad Sci, Inst Informat Theory & Automat, Dept Adapt Syst, Prague, Czech Republic
Source
2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2020
Keywords
probabilistic model; transfer learning; closed-loop behavior; fully probabilistic design; Bayesian estimation; sequential decision making;
DOI
10.1109/smc42975.2020.9283093
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
We consider the problem of learning a decision policy from available past experience. Using the Fully Probabilistic Design (FPD) formalism, we propose a new general approach for finding a stochastic policy from past data. The proposed approach assigns a degree of similarity to each past closed-loop behavior. The degree of similarity expresses how close the current decision-making task is to a past task. Bayesian estimation then uses it to learn an approximately optimal policy that incorporates the best past experience. The approach learns the decision policy directly from the data, without interacting with any supervisor or expert and without using any reinforcement signal. The past experience may have been gathered under a decision objective different from the current one. Moreover, the past decision policy need not be optimal with respect to the past objective. We demonstrate our approach on simulated examples and show that the learned policy achieves better performance than the optimal FPD policy whenever mismodeling is present.
Pages: 37-44
Number of pages: 8