A Hybrid Supervised/Unsupervised Machine Learning Approach to Solar Flare Prediction

Cited by: 59
Authors
Benvenuto, Federico [1]
Piana, Michele [1, 2]
Campi, Cristina [2]
Massone, Anna Maria [2]
Affiliations
[1] Univ Genoa, Dipartimento Matemat, Via Dodecaneso 35, I-16146 Genoa, Italy
[2] CNR, SPIN Genova, Via Dodecaneso 35, I-16146 Genoa, Italy
Funding
EU Horizon 2020
Keywords
methods: data analysis; methods: statistical; Sun: flares; sunspots;
DOI
10.3847/1538-4357/aaa23c
Chinese Library Classification (CLC) number
P1 [Astronomy]
Discipline classification code
0704
Abstract
This paper introduces a novel method for flare forecasting that combines prediction accuracy with the ability to identify the most relevant predictive variables. The method works in two steps. First, a supervised regularization method for regression, namely LASSO, is applied; its sparsity-enhancing penalty term allows the identification of the significance with which each data feature contributes to the prediction. Then, an unsupervised fuzzy clustering technique for classification, namely Fuzzy C-Means, is applied; it partitions the regression outcome through the minimization of a cost function, without optimizing any specific skill score. The approach is therefore hybrid, since it combines supervised and unsupervised learning; it realizes classification in an automatic, skill-score-independent way; and it provides effective prediction performance even in the case of imbalanced data sets. Its predictive power is verified against NOAA Space Weather Prediction Center data, using data from 1988 December to 1996 June as the training set and data from 1996 August to 2010 December as the test set. To validate the method, we computed several skill scores typically used in flare prediction and compared the values obtained with the hybrid approach against those obtained with several standard (non-hybrid) machine learning methods. The results show that the hybrid approach performs classification better than all the other supervised methods and with an effectiveness comparable to that of clustering methods; in addition, it provides a reliable ranking of the weights with which the data properties contribute to the forecast.
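The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the penalty value `lam=0.05`, the fuzzifier `m=2`, and all function names (`lasso_cd`, `fuzzy_c_means_1d`, `make_data`) are assumptions chosen for illustration, and the LASSO is solved here by plain coordinate descent in pure Python.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xw||^2 + lam*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw for the initial w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

def memberships_1d(values, centers, m=2.0):
    """Fuzzy C-Means membership of each scalar value in each cluster."""
    out = []
    for v in values:
        inv = [(abs(v - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        out.append([x / total for x in inv])
    return out

def fuzzy_c_means_1d(values, m=2.0, n_iter=100):
    """Two-cluster Fuzzy C-Means on scalars; returns the cluster centers."""
    centers = [min(values), max(values)]
    for _ in range(n_iter):
        u = memberships_1d(values, centers, m)
        centers = [
            sum(ui[k] ** m * v for ui, v in zip(u, values))
            / sum(ui[k] ** m for ui in u)
            for k in range(2)
        ]
    return centers

def make_data(rng, n):
    """Hypothetical data: 5 features, only the first two drive the label."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(5)]
        score = 2.0 * row[0] - 1.5 * row[1] + 0.3 * rng.gauss(0.0, 1.0)
        X.append(row)
        y.append(1.0 if score > 1.0 else 0.0)  # minority "flare" class
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(rng, 300)
X_test, y_test = make_data(rng, 200)

# Step 1 (supervised): sparse regression on the centered binary labels.
y_mean = sum(y_train) / len(y_train)
weights = lasso_cd(X_train, [v - y_mean for v in y_train], lam=0.05)

def predict(X):
    return [sum(w * x for w, x in zip(weights, row)) for row in X]

# Step 2 (unsupervised): cluster the regression outcomes, no skill score involved.
centers = fuzzy_c_means_1d(predict(X_train))
flare_cluster = 0 if centers[0] > centers[1] else 1
u_test = memberships_1d(predict(X_test), centers)
labels = [1.0 if u[flare_cluster] > 0.5 else 0.0 for u in u_test]
accuracy = sum(a == b for a, b in zip(labels, y_test)) / len(y_test)

print("LASSO weights:", [round(w, 3) for w in weights])
print("cluster centers:", [round(c, 3) for c in centers])
print("test accuracy:", round(accuracy, 3))
```

In this sketch the magnitude of each LASSO weight plays the role of the feature ranking, and the decision threshold emerges from the clustering of the regression output rather than from tuning against a particular skill score.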
Pages: 9