Stock market forecasting using a multi-task approach integrating long short-term memory and the random forest framework

Cited by: 77
Authors
Park, Hyun Jun [1]
Kim, Youngjun [2]
Kim, Ha Young [2]
Affiliations
[1] Ajou Univ, Dept Data Sci, Worldcupro 206, Suwon 16499, South Korea
[2] Yonsei Univ, Grad Sch Informat, Yonsei Ro 50, Seoul 03722, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Financial time-series forecasting; Hybrid deep learning; Multi-task learning; Explainable AI; Overfitting; NEURAL-NETWORKS; TECHNICAL ANALYSIS; TIME-SERIES; PERFORMANCE; CLASSIFICATION; PARAMETERS; INDEXES; MODEL;
DOI
10.1016/j.asoc.2021.108106
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Numerous studies have adopted deep learning (DL) in financial market forecasting models owing to its superior performance. DL models require as many relevant input variables as possible to improve performance because they learn high-level features from these inputs. However, as the number of input variables increases, the number of parameters also increases, leaving the model prone to overfitting. We propose a novel stock-market prediction framework (LSTM-Forest) integrating long short-term memory and random forest (RF) to address this issue. We also develop a multi-task model that predicts stock market returns and classifies return directions to improve predictability and profitability. Our model is interpretable because it can identify key variables using the variable importance analysis from RF. We used three global stock indices and 43 financial technical indicators to verify the proposed methods experimentally. The root mean squared errors of LSTM-Forest with multi-task (LFM) in predicting returns from S&P500, SSE, and KOSPI200 were 25.53%, 22.75%, and 16.29% lower, respectively, than those of the baseline RF model. The model's balanced accuracy in forecasting the daily return direction increased by 7.37, 1.68, and 3.79 points, respectively. Furthermore, our multi-task model outperforms our single-task model and previous DL approaches. In the trading test, LFM produced the highest profits, even compared with the long-only strategy when transaction costs were considered. The proposed framework is readily extensible to other tasks and fields that contend with high-dimensionality problems. © 2021 Elsevier B.V. All rights reserved.
Pages: 17