FF-BERT: A BERT-based ensemble for automated classification of web-based text on flash flood events

Cited by: 11
Authors
Wilkho, Rohan Singh [1]
Chang, Shi [2]
Gharaibeh, Nasir G. [1]
Affiliations
[1] Texas A&M Univ, Zachry Dept Civil & Environm Engn, College Stn, TX 77840 USA
[2] Trimble Inc, Westminster, CO 80021 USA
Keywords
Flash flood; Text classification; Multi-label text classification; BERT
DOI
10.1016/j.aei.2023.102293
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The web is a rich information repository that can be mined to uncover additional data about past flash flood (FF) events that is currently missing from existing structured databases. However, this information originates from multiple sources (news articles, government records, and weather records, among others) and may cover several topics. Furthermore, these topics may be disproportionately covered on the web. The large size and heterogeneous nature of web information render manual review difficult. To address this challenge, we have developed a multi-label text classification model, FF-BERT. FF-BERT is designed to classify FF-related web paragraphs into one or more of seven categories: (1) Damage and Economic Impact (DI), (2) Fatalities, Injuries, and Rescue (FIR), (3) Hydrometeorology (HM), (4) Warning and Emergency (WE), (5) Response and Recovery (RR), (6) Public Health (PH), and (7) Mitigation (MG). To develop FF-BERT, we labeled 21,180 paragraphs from FF-related webpages and performed experiments with multiple model architectures based on the widely used language model Bidirectional Encoder Representations from Transformers (BERT). Our final model outperforms the baseline by 11.83%, as measured by the micro-F1 score. In addition, FF-BERT significantly improves the prediction of minority labels (RR: 32.1%, PH: 260.4%, and MG: 138.6%). We demonstrate using real-world examples that FF-BERT can be used to uncover new information about flash flood events. This information can be used to enhance existing databases, such as NOAA's Storm Events Database.
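The micro-F1 score cited in the abstract pools true positives, false positives, and false negatives across all seven labels before computing F1, so frequent labels weigh more than rare ones. The sketch below is illustrative only and is not the authors' code: the three toy paragraphs, their labels, and the helper names (`to_indicator`, `micro_f1`, `relative_improvement`) are hypothetical. It also assumes the reported percentages are relative changes over the baseline, which the abstract does not state explicitly.

```python
# Label abbreviations taken from the abstract.
LABELS = ["DI", "FIR", "HM", "WE", "RR", "PH", "MG"]


def to_indicator(label_sets):
    """Turn a list of label sets into 0/1 rows, one column per label."""
    return [[1 if lab in s else 0 for lab in LABELS] for s in label_sets]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over every (paragraph, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_improvement(new, base):
    """Percent change of `new` over `base` (assumed reading of the abstract)."""
    return 100.0 * (new - base) / base


# Hypothetical gold and predicted labels for three paragraphs.
y_true = to_indicator([{"DI", "HM"}, {"FIR"}, {"WE", "RR"}])
y_pred = to_indicator([{"DI"}, {"FIR", "HM"}, {"WE", "RR"}])

score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.2f}")
```

Under the relative-change reading, a gain above 100% (such as the one reported for PH) means the label's score more than doubled relative to the baseline, which is possible only when the baseline score for that label is low.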
Pages: 12
Related papers
50 in total
  • [1] Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge
    Yu, Shanshan
    Su, Jindian
    Luo, Da
    IEEE ACCESS, 2019, 7 : 176600 - 176612
  • [2] A Study of BERT-Based Classification Performance of Text-Based Health Counseling Data
    Sung, Yeol Woo
    Park, Dae Seung
    Kim, Cheong Ghil
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2023, 135 (01) : 795 - 808
  • [3] Three-Branch BERT-Based Text Classification Network for Gastroscopy Diagnosis Text
    Wang Z.
    Zheng X.
    Zhang J.
    Zhang M.
    International Journal of Crowd Science, 2024, 8 (01) : 56 - 63
  • [4] BERT-based Ensemble Approaches for Hate Speech Detection
    Mnassri, Khouloud
    Rajapaksha, Praboda
    Farahbakhsh, Reza
    Crespi, Noel
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022 : 4649 - 4654
  • [5] BERT-based chinese text classification for emergency management with a novel loss function
    Zhongju Wang
    Long Wang
    Chao Huang
    Shutong Sun
    Xiong Luo
    Applied Intelligence, 2023, 53 : 10417 - 10428
  • [6] BERT-based chinese text classification for emergency management with a novel loss function
    Wang, Zhongju
    Wang, Long
    Huang, Chao
    Sun, Shutong
    Luo, Xiong
    APPLIED INTELLIGENCE, 2023, 53 (09) : 10417 - 10428
  • [7] Enhancing Arabic Word Sense Disambiguation with Ensemble BERT-Based Models
    Djaidri, Asma
    Aliane, Hassina
    Azzoune, Hamid
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2023, PT I, 2025, 2339 : 261 - 272
  • [8] Realistic Image Generation from Text by Using BERT-Based Embedding
    Na, Sanghyuck
    Do, Mirae
    Yu, Kyeonah
    Kim, Juntae
    ELECTRONICS, 2022, 11 (05)
  • [9] Hierarchical graph-based text classification framework with contextual node embedding and BERT-based dynamic fusion
    Onan, Aytug
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (07)
  • [10] Chinese Text Classification Method Based on BERT Word Embedding
    Wang, Ziniu
    Huang, Zhilin
    Gao, Jianling
    2020 5TH INTERNATIONAL CONFERENCE ON MATHEMATICS AND ARTIFICIAL INTELLIGENCE (ICMAI 2020), 2020 : 66 - 71