Crude oil price forecasting incorporating news text

Cited by: 39
Authors
Bai, Yun [1]
Li, Xixi [2]
Yu, Hao [1]
Jia, Suling [1]
Affiliations
[1] Beihang Univ, Sch Econ & Management, Beijing 100191, Peoples R China
[2] Univ Manchester, Dept Math, Manchester M13 9PL, Lancs, England
Keywords
Crude oil price; Text features; News headlines; Multivariate time series; Forecasting; Prediction; Models
DOI
10.1016/j.ijforecast.2021.06.006
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
Sparse and short news headlines can be arbitrary, noisy, and ambiguous, making it difficult for the classic topic model LDA (latent Dirichlet allocation), which was designed for long text, to discover knowledge from them. Nonetheless, some existing research on text-based crude oil forecasting employs LDA to explore topics from news headlines, resulting in a mismatch between the short text and the topic model that further affects forecasting performance. Exploiting advanced and appropriate methods to construct high-quality features from news headlines is therefore crucial in crude oil forecasting. This paper introduces two novel indicators, of topic and of sentiment, for short and sparse text data to tackle this issue. Empirical experiments show that AdaBoost.RT with our proposed text indicators, which give a more comprehensive view and characterization of the short and sparse text data, outperforms the other benchmarks. Another significant merit is that our method also yields good forecasting performance when applied to other futures commodities. © 2021 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
Pages: 367-383
Page count: 17
Related papers
51 in total
  • [1] Aggarwal C.C., 2012, Mining Text Data, DOI 10.1007/978-1-4614-3223-4_6, 10.1007/978-1-4614-3223-4
  • [2] [Anonymous], 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, DOI 10.5591/978-1-57735-516-8/IJCAI11-298
  • [3] [Anonymous], 2004, Computing Reviews
  • [4] A multi-model approach for describing crude oil price dynamics
    Bernabe, A
    Martina, E
    Alvarez-Ramirez, J
    Ibarra-Valdez, C
    [J]. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2004, 338 (3-4) : 567 - 584
  • [5] Latent Dirichlet allocation
    Blei, DM
    Ng, AY
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 993 - 1022
  • [6] Empirical models for the spatial distribution of wildlife
    Buckland, ST
    Elston, DA
    [J]. JOURNAL OF APPLIED ECOLOGY, 1993, 30 (03) : 478 - 495
  • [7] Chauhan S. R., 2015, INT J COMPUT APPL, V111, P12
  • [8] ARIMA models to predict next-day electricity prices
    Contreras, J
    Espínola, R
    Nogales, FJ
    Conejo, AJ
    [J]. IEEE TRANSACTIONS ON POWER SYSTEMS, 2003, 18 (03) : 1014 - 1020
  • [9] Deerwester S, 1990, J AM SOC INFORM SCI, V41, P391, DOI 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9