Heavy-Tailed Reinforcement Learning With Penalized Robust Estimator

Citations: 0
Authors
Park, Hyeon-Jun [1]
Lee, Kyungjae [1]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Noise measurement; Heavily-tailed distribution; Q-learning; Stochastic processes; Random variables; Object recognition; Markov decision processes; Reinforcement learning; heavy-tailed noise; regret analysis;
DOI
10.1109/ACCESS.2024.3424828
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We consider finite-horizon episodic reinforcement learning (RL) under heavy-tailed noise, where the p-th moment is bounded for any p ∈ (1, 2]. In this setting, existing RL algorithms are limited by their requirement for prior knowledge of the bounded moment order of the noise distribution. This requirement hinders their practical application, as such prior information is rarely available in real-world scenarios. Our proposed method eliminates the need for this prior knowledge, enabling implementation in a wider range of scenarios. We introduce two RL algorithms, p-Heavy-UCRL and p-Heavy-Q-learning, designed for model-based and model-free RL settings, respectively. Without the need for prior knowledge, these algorithms demonstrate robustness to heavy-tailed noise and achieve nearly optimal regret bounds, up to logarithmic terms, with the same dependencies on dominating terms as existing algorithms. Finally, we show that our proposed algorithms have empirically comparable performance to existing algorithms in a synthetic tabular scenario.
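To illustrate why heavy-tailed noise calls for a robust estimator, here is a minimal sketch of median-of-means, a standard robust mean estimator. It is not the penalized robust estimator proposed in the paper, whose details are not given in this record; the function name `median_of_means`, the parameter `num_blocks`, and the toy reward values are all assumptions made for illustration.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Median-of-means estimate of the mean (illustrative, not the paper's estimator).

    The samples are split into `num_blocks` contiguous blocks of equal size,
    each block is averaged, and the median of the block means is returned.
    Samples left over after the last full block are ignored.
    """
    if num_blocks < 1 or num_blocks > len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical reward observations with a single extreme outlier.
rewards = [1.0, 1.0, 1.0, 1.0, 1.0, 1000.0]
plain_mean = statistics.fmean(rewards)                 # dominated by the outlier
robust_mean = median_of_means(rewards, num_blocks=3)   # outlier confined to one block
print(plain_mean, robust_mean)
```

In this toy example the outlier affects only one of the three block means, so the median ignores it, while the plain empirical mean is pulled far from the typical reward.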
Pages: 107800 - 107817
Number of pages: 18