Machine Learning Modelling and Feature Engineering in Seismology Experiment

Cited by: 17
Authors
Brykov, Michail Nikolaevich [1]
Petryshynets, Ivan [2]
Pruncu, Catalin Iulian [3, 4]
Efremenko, Vasily Georgievich [5]
Pimenov, Danil Yurievich [6]
Giasin, Khaled [7]
Sylenko, Serhii Anatolievich [1]
Wojciechowski, Szymon [8]
Affiliations
[1] Zaporizhzhia Polytech Natl Univ, UA-69063 Zaporizhzhia, Ukraine
[2] Slovak Acad Sci, Inst Mat Res, Kosice 04001, Slovakia
[3] Imperial Coll London, Mech Engn, Exhibit Rd, London SW7 2AZ, England
[4] Univ Birmingham, Sch Engn, Mech Engn, Birmingham B15 2TT, W Midlands, England
[5] Pryazovskyi State Tech Univ, Phys Dept, UA-87555 Mariupol, Ukraine
[6] South Ural State Univ, Dept Automated Mech Engn, Lenin Prosp 76, Chelyabinsk 454080, Russia
[7] Univ Portsmouth, Sch Mech & Design Engn, Portsmouth PO1 3DJ, Hants, England
[8] Poznan Univ Tech, Fac Mech Engn, Piotrowo 3, PL-60965 Poznan, Poland
Keywords
seismology; earthquake prediction; laboratory experiment; acoustic data; machine learning; feature engineering; artificial intelligence; EARTHQUAKE
DOI
10.3390/s20154228
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This article discusses machine learning modelling using a dataset provided by the LANL (Los Alamos National Laboratory) earthquake prediction competition hosted by Kaggle. The data were obtained from a laboratory stick-slip friction experiment that mimics real earthquakes. Digitized acoustic signals were recorded against the time to failure of a granular layer compressed between steel plates. In this work, machine learning was employed to develop models that could predict earthquakes. The aim is to highlight the importance and potential applicability of machine learning in seismology. The XGBoost algorithm was used for modelling, combined with 6-fold cross-validation and the mean absolute error (MAE) metric for model quality estimation. The backward feature elimination technique was used, followed by the forward feature construction approach, to find the best combination of features. The advantage of this feature engineering method is that it enables the best subset to be found from a relatively large set of features in a relatively short time. It was confirmed that the proper combination of statistical characteristics describing acoustic data can be used for effective prediction of time to failure. Additionally, statistical features based on the autocorrelation of acoustic data can be used for further improvement of model quality. A total of 48 statistical features were considered. The best subset was determined as having 10 features. Its corresponding MAE was 1.913 s, which was stable to the third decimal place. The presented results can be used to develop artificial intelligence algorithms devoted to earthquake prediction.
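The two-stage search described in the abstract (backward feature elimination followed by forward feature construction) can be sketched as below. This is a minimal illustration, not the authors' implementation. The scoring function, the feature names, and the numbers in `toy_score` are hypothetical stand-ins. In the setting of the abstract, the scorer would be the 6-fold cross-validated MAE of an XGBoost model trained on the chosen subset.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# A scorer maps a feature subset to an error value; lower is better.
Scorer = Callable[[FrozenSet[str]], float]


def backward_elimination(features: Iterable[str],
                         score: Scorer) -> Tuple[FrozenSet[str], float]:
    """Greedily drop the feature whose removal lowers the score the most."""
    current = frozenset(features)
    best = score(current)
    while len(current) > 1:
        # Score every subset obtained by removing exactly one feature.
        cand_score, cand_feat = min(
            (score(current - {f}), f) for f in sorted(current))
        if cand_score >= best:  # no single removal helps, so stop
            break
        current, best = current - {cand_feat}, cand_score
    return current, best


def forward_construction(selected: Iterable[str], pool: Iterable[str],
                         score: Scorer) -> Tuple[FrozenSet[str], float]:
    """Greedily add back the pool feature that lowers the score the most."""
    current = frozenset(selected)
    best = score(current)
    remaining = set(pool) - current
    while remaining:
        cand_score, cand_feat = min(
            (score(current | {f}), f) for f in sorted(remaining))
        if cand_score >= best:  # no single addition helps, so stop
            break
        current, best = current | {cand_feat}, cand_score
        remaining.discard(cand_feat)
    return current, best


# Hypothetical toy scorer (assumption): three features help, two hurt, and
# "kurtosis" helps only once the two harmful features are gone.
USEFUL = {"std", "q95", "acf_lag1"}
NOISE = {"mean", "min"}


def toy_score(subset: FrozenSet[str]) -> float:
    milli = 2100 - 50 * len(subset & USEFUL) + 20 * len(subset & NOISE)
    if "kurtosis" in subset:
        milli += 70 if subset & NOISE else -10
    return milli / 1000  # integer arithmetic keeps ties exact


all_features = sorted(USEFUL | NOISE | {"kurtosis"})
kept, mae_backward = backward_elimination(all_features, toy_score)
subset, mae = forward_construction(kept, all_features, toy_score)
print(sorted(subset), mae)
```

In this toy run the backward pass discards `kurtosis` early because it interacts badly with the harmful features, and the forward pass restores it once they are gone. That interaction is the reason for running a forward pass after a backward one. Both passes are greedy, so the result is a good subset found quickly rather than a guaranteed global optimum.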
Page range: 1-15
Page count: 14