News-based intelligent prediction of financial markets using text mining and machine learning: A systematic literature review

Cited by: 42
Authors
Ashtiani, Matin N. [1]
Raahemi, Bijan [1]
Affiliations
[1] Univ Ottawa, Telfer Sch Management, Knowledge Discovery & Data Min Lab, 55 Laurier Ave East, Ottawa, ON K1N 6N5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Stock market prediction; Systematic literature review; Natural language processing; Text mining; Machine learning; ARTIFICIAL NEURAL-NETWORKS; SENTIMENT ANALYSIS; STOCK; ALGORITHMS; RESOURCES; FORECAST; ARTICLES; SUPPORT; IMPACT; RETURN;
DOI
10.1016/j.eswa.2023.119509
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Researchers and practitioners have attempted to predict the financial market by analyzing textual (e.g., news articles and social media) and numeric data (e.g., hourly stock prices, and moving averages). Among textual data, while many papers have been published that analyze social media, news content has gained limited attention in predicting the stock market. Acknowledging that news is critical in predicting the stock market, the focus of this systematic review is on papers investigating machine learning and text mining techniques to predict the stock market using news. Using Kitchenham's methodology, we present a systematic review of the literature on intelligent financial market prediction, examining data mining and machine learning approaches and the employed datasets. From five digital libraries, we identified 61 studies from 2015-2022 for synthesis and interpretation. We present notable gaps and barriers to predicting financial markets, then recommend future research scopes. Various input data, including numerical (stock prices and technical indicators) and textual data (news text and sentiment), have been employed for news-based stock market prediction. News data collection can be costly and time-consuming: most studies have used custom crawlers to gather news articles; however, there are financial news databases available that could significantly facilitate news collection. Furthermore, although most datasets have covered fewer than 100K records, deep learning and more sophisticated artificial neural networks can process enormous datasets faster, improving future model performance. There is a growing trend toward using artificial neural networks, particularly recurrent neural networks and deep learning models, from 2018 to 2021. Furthermore, regression and gradient-boosting models have been developed for stock market prediction during the last four years.
Although word embedding approaches for feature representation have been employed recently with good accuracy, emerging language models may be a focus for future research. Advanced natural language processing methods like transformers have undeniably contributed to intelligent stock market prediction. However, stock market prediction has not yet taken full advantage of them.
Pages: 23