A proposal for an approach to mapping susceptibility to landslides using natural language processing and machine learning

Cited by: 13
Authors
Rodrigues, Saulo Guilherme [1 ]
Silva, Maisa Mendonca [2 ]
Alencar, Marcelo Hazin [3 ]
Affiliations
[1] Univ Fed Pernambuco, Ctr Acad Agreste CAA, Ave Marielle Franco S-N,Km 59, BR-55014900 Caruaru, PE, Brazil
[2] Univ Fed Pernambuco, Dept Engn Management, Ave Arquitetura Cidade Univ, BR-50740550 Recife, PE, Brazil
[3] Univ Fed Pernambuco UFPE, Res Grp Risk Assessment & Modelling Environm Asse, Recife, PE, Brazil
Keywords
Mapping susceptibility; Natural language processing; Machine learning; ARTIFICIAL NEURAL-NETWORKS; DATA MINING TECHNIQUES; RANDOM-FOREST; SPATIAL PREDICTION; FLOOD SUSCEPTIBILITY; DECISION TREE; PERFORMANCE EVALUATION; LOGISTIC-REGRESSION; TEXT CLASSIFICATION; GENETIC ALGORITHM
DOI
10.1007/s10346-021-01643-3
Chinese Library Classification
P5 [Geology]
Subject classification codes
0709; 081803
Abstract
Compiling an inventory is a fundamental step in assessing landslide hazards. However, data of sufficient quantity and quality are not always available. This study therefore puts forward an approach for drawing up a landslide inventory from the textual data of telephone records and for mapping landslide hazards in an urban area. A total of 40,792 textual records were classified with the naive Bayes algorithm, and the classified records form the landslide inventory. After the inventory was created, the random forest algorithm with 12 conditioning variables was used to map landslide hazards. The text classification model obtained an accuracy of 0.8671 and a Kappa index of 0.8038. The hazard mapping model obtained an accuracy of 0.9503 and an area under the receiver operating characteristic curve (AUC-ROC) of 0.9870. The model's results were also compared with real landslides described in news reports and were close to what had actually happened, which demonstrates the ability of the proposed approach to predict landslides. Finally, the proposed approach can be used in simulation environments to support strategic decision-making associated with hazard analysis.
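The text-classification step and its two reported metrics (accuracy and the Kappa index) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which naive Bayes variant was used, so the multinomial form with add-one smoothing is an assumption, the function names are hypothetical, and the six training records are invented toy examples standing in for the telephone records.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count classes and per-class words (assumed multinomial naive Bayes)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior for one record."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # skip words never seen in training
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance; undefined when chance agreement is 1."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Invented toy records, for illustration only.
train_docs = [
    "landslide blocked the road after heavy rain",
    "slope collapsed behind the house",
    "earth slid down the hillside onto homes",
    "street flooded after heavy rain",
    "water entered the house from the river",
    "tree fell on the power line",
]
train_labels = ["landslide", "landslide", "landslide", "flood", "flood", "other"]

model = train_naive_bayes(train_docs, train_labels)
label_a = predict(model, "hillside slope collapsed")
label_b = predict(model, "river water flooded the street")
kappa_demo = cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"])
```

In a workflow like the one the abstract describes, records predicted as landslides would make up the inventory, and accuracy and Kappa would be computed on held-out labelled records rather than on the training data.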
Pages: 2515-2529
Number of pages: 15