Finding an Accurate Early Forecasting Model from Small Dataset: A Case of 2019-nCoV Novel Coronavirus Outbreak

Cited by: 103
Authors
Fong, Simon James [1, 2]
Li, Gloria [2 ]
Dey, Nilanjan [3 ]
Gonzalez Crespo, Ruben [4 ]
Herrera-Viedma, Enrique [5 ]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
[2] Chinese Acad Sci, Zhuhai Inst Adv Technol, DACC Lab, Beijing, Peoples R China
[3] Techno India Coll Technol, Dept Informat Technol, Kolkata, W Bengal, India
[4] Univ Int La Rioja, Logrono, Spain
[5] Univ Granada, Granada, Spain
Keywords
Forecasting; Machine Learning; Method; Prediction; Data Mining; Epidemic; Neural Networks; Selection
DOI
10.9781/ijimai.2020.02.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An epidemic is the rapid and wide spread of an infectious disease that threatens many lives and causes economic damage. It is important to foretell the lifetime of an epidemic in order to decide on timely remedial actions. These measures include closing borders and schools and suspending community services and commuting. When such restrictions can be lifted depends on the momentum of the outbreak and its rate of decay. Accurately forecasting the fate of an epidemic is an extremely important but difficult task. Because of the limited knowledge of the novel disease, the high uncertainty involved, and the complex societal-political factors that influence the spread of the new virus, any forecast is far from reliable. Another factor is the insufficient amount of available data: data samples are often scarce when an epidemic has just started. With only a few training samples on hand, finding a forecasting model that offers a best-effort forecast is a big challenge in machine learning. In the past, three popular methods have been proposed: 1) augmenting the small amount of existing data, 2) using panel selection to pick the best forecasting model from several candidates, and 3) fine-tuning the parameters of an individual forecasting model for the highest possible accuracy. In this paper, a methodology that embraces these three virtues of data mining from a small dataset is proposed. An experiment based on the recent coronavirus outbreak that originated in Wuhan is conducted by applying this methodology. It is shown that an optimized forecasting model constructed from a new algorithm, namely a polynomial neural network with corrective feedback (PNN+cf), is able to make a forecast with the lowest prediction error among the models compared. The results show that the newly proposed methodology and PNN+cf are useful for generating acceptable forecasts at the critical time of a disease outbreak, when samples are far from abundant.
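The panel-selection idea mentioned in the abstract (method 2) can be illustrated with a minimal sketch. This is not the authors' methodology and does not implement PNN+cf; it only shows, under simple assumptions, how several candidate forecasters might be scored on a very short series with a rolling-origin evaluation and the one with the lowest error picked. The three baseline forecasters, the names `rolling_origin_rmse`, `select_best`, and `PANEL`, and the synthetic `toy_series` are all hypothetical and chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

# A forecaster maps the observed history to a one-step-ahead prediction.
Forecaster = Callable[[List[float]], float]


def naive(history: List[float]) -> float:
    """Predict that the next value equals the last observed value."""
    return history[-1]


def drift(history: List[float]) -> float:
    """Extend the straight line through the first and last observations."""
    if len(history) < 2:
        return history[-1]
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope


def mean(history: List[float]) -> float:
    """Predict the average of all observed values."""
    return sum(history) / len(history)


def rolling_origin_rmse(series: List[float], model: Forecaster,
                        min_train: int = 3) -> float:
    """Score a model by repeatedly forecasting one step ahead.

    For each origin t, the model sees series[:t] and predicts series[t].
    """
    squared_errors = []
    for t in range(min_train, len(series)):
        prediction = model(series[:t])
        squared_errors.append((prediction - series[t]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_best(series: List[float], panel: Dict[str, Forecaster],
                min_train: int = 3) -> Tuple[str, Dict[str, float]]:
    """Return the name of the lowest-RMSE model and all scores."""
    scores = {name: rolling_origin_rmse(series, model, min_train)
              for name, model in panel.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical panel of simple baselines (assumption, not from the paper).
PANEL: Dict[str, Forecaster] = {"naive": naive, "drift": drift, "mean": mean}

# Synthetic, perfectly linear toy series; NOT real outbreak data.
toy_series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]

best_name, scores = select_best(toy_series, PANEL)
print(best_name, scores)
```

On a real small dataset the panel would hold richer models, and each candidate's parameters could be tuned inside the same evaluation loop; the rolling-origin split is used here because it never lets a model see values later than the one it is asked to predict.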
Pages: 132-140
Page count: 9