Regularized feature selection in reinforcement learning

Cited: 11
Authors
Wookey, Dean S. [1 ]
Konidaris, George D. [2 ,3 ]
Affiliations
[1] Univ Witwatersrand, Sch Comp Sci & Appl Math, Johannesburg, South Africa
[2] Duke Univ, Dept Comp Sci, Durham, NC 27708 USA
[3] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27708 USA
Keywords
Feature selection; Reinforcement learning; Function approximation; Regularization; Linear function approximation; OMP-TD;
DOI
10.1007/s10994-015-5518-8
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce feature regularization during feature selection for value function approximation. Feature regularization introduces a prior into the selection process, improving function approximation accuracy and reducing overfitting. We show that the smoothness prior is effective in the incremental feature selection setting and present closed-form smoothness regularizers for the Fourier and RBF bases. We present two feature regularization methods that extend the temporal difference orthogonal matching pursuit (OMP-TD) algorithm, smooth Tikhonov OMP-TD and smoothness scaled OMP-TD, and use them to demonstrate the effectiveness of the smoothness prior. We compare these methods against OMP-TD, regularized OMP-TD, and least squares TD with random projections across six benchmark domains using two different types of basis functions.
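To illustrate the kind of smoothness prior the abstract describes, the sketch below fits a linear value approximation on a univariate Fourier basis with a Tikhonov penalty that grows quadratically with feature frequency, so high-frequency (less smooth) features are discouraged. This is a minimal hypothetical example, not the paper's STOMP-TD algorithm: the basis, penalty weights, and target function are all illustrative assumptions.

```python
import numpy as np

def fourier_basis(s, order):
    # Univariate Fourier cosine basis: phi_i(s) = cos(pi * i * s), s in [0, 1].
    return np.cos(np.pi * np.outer(s, np.arange(order + 1)))

def tikhonov_fit(Phi, y, lam, penalty):
    # Closed-form solution of min_w ||Phi w - y||^2 + lam * w^T diag(penalty) w.
    A = Phi.T @ Phi + lam * np.diag(penalty)
    return np.linalg.solve(A, Phi.T @ y)

rng = np.random.default_rng(0)
s = rng.uniform(0.0, 1.0, size=200)
# Noisy samples of a smooth stand-in for a value function (illustrative target).
y = np.cos(2 * np.pi * s) + 0.1 * rng.normal(size=s.size)

order = 10
Phi = fourier_basis(s, order)
# Assumed smoothness penalty: quadratic in frequency, so rougher features
# pay a larger price (a hypothetical choice, not the paper's exact regularizer).
penalty = (np.pi * np.arange(order + 1)) ** 2

w_plain = tikhonov_fit(Phi, y, 0.0, penalty)   # unregularized least squares
w_smooth = tikhonov_fit(Phi, y, 0.1, penalty)  # smoothness-regularized fit
```

Increasing the regularization weight shrinks the high-frequency coefficients toward zero while leaving the low-frequency structure of the target largely intact, which is the effect a smoothness prior is meant to have during feature selection.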
Pages: 655-676
Page count: 22