Explainable Machine Learning Exploiting News and Domain-Specific Lexicon for Stock Market Forecasting

Cited by: 35

Authors:

Carta, Salvatore M. [1]

Consoli, Sergio [2]

Piras, Luca [1]

Podda, Alessandro Sebastian [1]

Recupero, Diego Reforgiato [1]

Affiliations:

[1] Univ Cagliari, Dept Math & Comp Sci, I-09124 Cagliari, Italy

[2] European Commiss, Joint Res Ctr DG JRC, I-21027 Ispra, Italy

Source:

IEEE ACCESS | 2021 / Vol. 9

Keywords:

Forecasting; Social networking (online); Companies; Stock markets; Feature extraction; Task analysis; Prediction algorithms; Stock market forecasting; machine learning; natural language processing; financial technology; explainable artificial intelligence; PREDICTION; RETURN;

DOI:

10.1109/ACCESS.2021.3059960

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles with the goal of identifying the most impactful words on the market in a specific time interval and within a certain business sector. A feature engineering process is then performed out of the generated lexicons, and the obtained features are fed to a Decision Tree classifier. The predicted label (high or low) represents the underlying company's stock price variation on the next day, being either higher or lower than a certain threshold. The performance evaluation we have carried out through a walk-forward strategy, and against a set of solid baselines, shows that our approach clearly outperforms the competitors. Moreover, the devised Artificial Intelligence (AI) approach is explainable, in the sense that we analyze the white-box behind the classifier and provide a set of explanations on the obtained results.
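
The labeling and evaluation scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the absolute relative price change as the "magnitude", and the rolling-window sizes are all assumptions made for illustration. The lexicon generation, feature engineering, and Decision Tree steps are left out.

```python
from typing import Iterator, List, Sequence, Tuple


def label_next_day_moves(prices: Sequence[float], threshold: float) -> List[int]:
    """Assign 1 ("high") or 0 ("low") to each day based on the next day's move.

    Assumption: "magnitude" means the absolute relative change between
    consecutive prices. The last day has no next day, so it gets no label.
    """
    labels = []
    for today, tomorrow in zip(prices, prices[1:]):
        change = abs(tomorrow - today) / today
        labels.append(1 if change > threshold else 0)
    return labels


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for a rolling walk-forward evaluation.

    Each test window immediately follows its training window, and the whole
    frame then slides forward by test_size, so no test sample ever precedes
    the data used to train on it.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


if __name__ == "__main__":
    # Hypothetical closing prices, for illustration only.
    prices = [100.0, 101.0, 104.0, 104.5, 100.0]
    print(label_next_day_moves(prices, threshold=0.02))
    for train, test in walk_forward_splits(10, train_size=4, test_size=2):
        print(list(train), list(test))
```

In a full pipeline, a classifier would be refit on each training window and scored on the test window that follows it, which keeps the evaluation free of look-ahead bias.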

Pages: 30193-30205

Number of pages: 13

References: 52 in total
