Deep Direct Reinforcement Learning for Financial Signal Representation and Trading

Cited by: 479
Authors
Deng, Yue [1 ,2 ]
Bao, Feng [1 ]
Kong, Youyong [3 ]
Ren, Zhiquan [1 ]
Dai, Qionghai [1 ]
Affiliations
[1] Tsinghua Univ, Automat Dept, Beijing 100084, Peoples R China
[2] Univ Calif San Francisco, Sch Pharm, San Francisco, CA 94158 USA
[3] Southeast Univ, Sch Comp Sci & Engn, Nanjing 210000, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; U.S. National Science Foundation;
Keywords
Deep learning (DL); financial signal processing; neural network (NN) for finance; reinforcement learning (RL); FUZZY NEURAL-NETWORK; PREDICTION; LOGIC;
DOI
10.1109/TNNLS.2016.2522401
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Code
081104; 0812; 0835; 1405;
Abstract
Can we train a computer to beat experienced traders at financial asset trading? In this paper, we address this challenge by introducing a recurrent deep neural network (NN) for real-time financial signal representation and trading. Our model draws on two biologically inspired learning concepts: deep learning (DL) and reinforcement learning (RL). In this framework, the DL part automatically senses dynamic market conditions to learn informative features. The RL module then interacts with these deep representations and makes trading decisions to accumulate the ultimate reward in an unknown environment. The learning system is implemented in a complex NN that exhibits both deep and recurrent structures. We therefore propose a task-aware backpropagation through time (BPTT) method to cope with the vanishing gradient problem in deep training. The robustness of the neural system is verified on both stock and commodity futures markets under broad testing conditions.
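The abstract describes a deep feature extractor feeding a recurrent trading layer, trained end to end to maximize accumulated profit. The following Python (PyTorch) sketch illustrates one plausible reading of that design; the layer sizes, the profit-minus-transaction-cost reward, and all identifiers (DirectRLTrader, total_reward) are illustrative assumptions, not the authors' published implementation.

# Minimal sketch of the DL + RL trading idea, assuming a PyTorch
# implementation. All names and constants are illustrative.
import torch
import torch.nn as nn

class DirectRLTrader(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        # DL part: deep feature learning from raw market observations.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        # RL part: the trading decision also sees the previous position,
        # which gives the network its recurrent structure.
        self.decision = nn.Linear(hidden + 1, 1)

    def forward(self, features):
        # features: (T, n_features) sequence of market observations.
        positions, prev = [], torch.zeros(1)
        for f in features:
            h = self.encoder(f)
            prev = torch.tanh(self.decision(torch.cat([h, prev])))
            positions.append(prev)  # position in [-1, 1] at each step
        return torch.cat(positions)  # shape (T,)

def total_reward(positions, returns, cost=1e-4):
    # Accumulated profit: previous position times price change, minus a
    # transaction cost proportional to changes in position.
    turnover = torch.abs(positions[1:] - positions[:-1]).sum()
    return (positions[:-1] * returns[1:]).sum() - cost * turnover

# Training maximizes the reward directly; autograd performs the
# backpropagation through time over the recurrent position chain.
model = DirectRLTrader(n_features=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
feats = torch.randn(200, 8)      # placeholder market features
rets = 0.01 * torch.randn(200)   # placeholder price changes
for _ in range(100):
    opt.zero_grad()
    (-total_reward(model(feats), rets)).backward()
    opt.step()

The long recurrent position chain is exactly where vanishing gradients arise; the paper's task-aware BPTT targets that problem, while this sketch simply relies on standard autograd BPTT.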
Pages: 653-664
Number of pages: 12