Constructing a stock-price forecast CNN model with gold and crude oil indicators

Cited by: 62
Authors
Chen, Yu-Chen [1]
Huang, Wen-Chen [1]
Affiliation
[1] Natl Kaohsiung Univ Sci & Technol, Dept Informat Management, 2 Juoyue Rd,Nantsu, Kaohsiung 811, Taiwan
Keywords
Stock price forecast; Deep learning; Convolutional neural networks; Long short-term memory; Bayesian optimization; OPTIMIZATION ALGORITHM; NEURAL-NETWORKS; TIME-SERIES; LSTM
DOI
10.1016/j.asoc.2021.107760
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we propose algorithms to predict future stock market trends from eight input features, including financial technical indicators, gold prices, a gold price volatility index, crude oil prices, a crude oil price volatility index, and other characteristic data. Two labeling methods are used with separate classification algorithms of two and three output categories: predicted stock price changes (up and down) and recommended trading actions (buy, sell, and hold), respectively. We also analyze the validity of these characteristic data in terms of their ability to predict future trends. The S&P 500 (GSPC) is the target of these forecasts. Sample data from 2010 to 2018 are divided 8:2 between training and validation data, while data from 2019 are used to test the proposed approach. CNN and LSTM models are compared in terms of classification accuracy and investment returns. Bayesian optimization (BO) of hyperparameters is used to improve the accuracy of the model and increase the return on investment (ROI) of the output predictions. The purpose of this study is to verify whether using gold prices, a gold volatility index, crude oil prices, and a crude oil price volatility index as input features can enable a deep learning model to predict future stock price trends accurately, and to discuss separately the applicability of CNN and LSTM models to these characteristics and financial indicators. We also present the results of experiments conducted to evaluate the proposed method in terms of classification accuracy and the confusion matrix. In the case of three-category classification, the model takes feature data as input and outputs a predicted trading order on whether to buy, sell, or hold a given set of stocks tomorrow, as well as the timing of entry into and exit from each position. It also backtests out-of-sample data to find the combination of characteristics and indicators that maximizes ROI.
Using this three-category method, we obtain a comprehensive ROI for a given set of individual stocks and assess whether each type of stock is suitable for a prediction model based on input features such as gold and crude oil, and which fields are suitable for a given feature. Experimental results show that the proposed approach is able to predict whether the stock price will rise or fall in the next 10 days, and the model accuracy rate can reach 67%. The proposed combined CNN model with eight features, referred to as CNN8, achieved an ROI of up to 13.23% on out-of-sample 2019 data, which was superior to the 12.08% and 11.06% obtained by the models designated CNN4 (CNN with four input features) and LSTM8 (LSTM with eight input features), respectively. The F1 score increased from 0.75 to 0.79 as a result of applying BO. The results indicate that considering the price of gold, the gold volatility index, the crude oil price, and the crude oil price volatility index can help obtain a better ROI for companies in certain fields, such as the semiconductor, petroleum, and automotive industries, than considering financial indicators alone. However, for companies related to apparel, fast food, and copy processing, input features consisting purely of financial technical indicators were found to be suitable. © 2021 Elsevier B.V. All rights reserved.
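The two-category labeling and the 8:2 split described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact labeling rule or say whether the split is chronological, so both are assumptions here: a day is labeled "up" when the close `horizon` days later is strictly higher, and the split keeps time order without shuffling. The function names `label_up_down` and `chronological_split` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def label_up_down(closes: Sequence[float], horizon: int = 10) -> List[int]:
    """Label day t as 1 (up) if the close `horizon` days later is higher, else 0.

    Assumed rule for illustration; the last `horizon` days get no label
    because their future close is unknown.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    return [
        1 if closes[t + horizon] > closes[t] else 0
        for t in range(len(closes) - horizon)
    ]


def chronological_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split rows in time order (no shuffling) into training and validation parts."""
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


if __name__ == "__main__":
    # Toy closing prices, not real S&P 500 data.
    closes = [100.0, 101.0, 99.0, 102.0, 98.0, 103.0]
    labels = label_up_down(closes, horizon=2)
    train, validation = chronological_split(labels, train_fraction=0.8)
    print(labels, train, validation)
```

Keeping the split in time order avoids training on observations that come after the validation period, which is a common precaution for time-series data; the paper may have split its data differently.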
Pages: 13