FF-BERT: A BERT-based ensemble for automated classification of web-based text on flash flood events

Cited by: 11
Authors
Wilkho, Rohan Singh [1]
Chang, Shi [2]
Gharaibeh, Nasir G. [1]
Affiliations
[1] Texas A&M Univ, Zachry Dept Civil & Environm Engn, College Stn, TX 77840 USA
[2] Trimble Inc, Westminster, CO 80021 USA
Keywords
Flash flood; Text classification; Multi-label text classification; BERT
DOI
10.1016/j.aei.2023.102293
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The web is a rich information repository that can be mined to uncover additional data about past flash flood (FF) events that are currently missing from existing structured databases. However, this information originates from multiple sources (news articles, government records, and weather records, among others) and may cover several topics. Furthermore, these topics may be disproportionately covered on the web. The large size and heterogeneous nature of web information render manual review difficult. To address this challenge, we have developed a multi-label text classification model, FF-BERT. FF-BERT is designed to classify FF-related web paragraphs into one or more of seven categories: (1) Damage and Economic Impact (DI), (2) Fatalities, Injuries, and Rescue (FIR), (3) Hydrometeorology (HM), (4) Warning and Emergency (WE), (5) Response and Recovery (RR), (6) Public Health (PH), and (7) Mitigation (MG). To develop FF-BERT, we labeled 21,180 paragraphs from FF-related webpages and performed experiments with multiple model architectures based on the widely used language model Bidirectional Encoder Representations from Transformers (BERT). Our final model outperforms the baseline by 11.83%, as measured by the micro-F1 score. In addition, FF-BERT significantly improves the prediction of minority labels (RR: 32.1%, PH: 260.4%, and MG: 138.6%). Using real-world examples, we demonstrate that FF-BERT can be used to uncover new information about flash flood events. This information can be used to enhance existing databases, such as NOAA's Storm Events Database.
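As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how a micro-F1 score and a per-label F1 score can be computed for multi-label predictions over the seven categories. It is not the authors' code: the function names `micro_f1` and `label_f1` and the four toy paragraphs are assumptions made up for this example, and the numbers have no relation to the results reported in the paper.

```python
# Illustrative sketch only; the data below is invented, not taken from the paper.
LABELS = ["DI", "FIR", "HM", "WE", "RR", "PH", "MG"]


def micro_f1(y_true, y_pred):
    """Micro-F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - true_labels)   # labels predicted but not true
        fn += len(true_labels - pred_labels)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def label_f1(y_true, y_pred, label):
    """F1 for a single label, useful for inspecting minority categories."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical gold and predicted label sets for four paragraphs.
y_true = [{"DI", "HM"}, {"FIR"}, {"RR", "MG"}, {"HM"}]
y_pred = [{"DI"}, {"FIR", "WE"}, {"RR", "MG"}, {"HM", "PH"}]

overall = micro_f1(y_true, y_pred)
per_label = {label: label_f1(y_true, y_pred, label) for label in LABELS}
print(f"micro-F1 = {overall:.3f}")
print(per_label)
```

Because micro-F1 pools counts across all labels, frequent categories dominate it, which is one reason per-label scores are worth reporting separately for minority labels such as RR, PH, and MG.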
Pages: 12