BERT-based Chinese text classification for emergency management with a novel loss function

Times Cited: 22
Authors
Wang, Zhongju [1 ,2 ]
Wang, Long [1 ,2 ,3 ]
Huang, Chao [1 ,2 ]
Sun, Shutong [4 ]
Luo, Xiong [1 ,2 ]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Beijing Key Lab Knowledge Engn Mat Sci, Beijing 100083, Peoples R China
[3] Univ Sci & Technol Beijing, Shunde Grad Sch, Foshan, Peoples R China
[4] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Keywords
Natural language processing; Deep learning; Text classification; Emergency management; SMOTE
DOI
10.1007/s10489-022-03946-x
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an automatic Chinese text categorization method for solving the emergency event report classification problem. Since bidirectional encoder representations from transformers (BERT) have achieved great success in the natural language processing domain, BERT is employed to derive emergency text features in this study. To overcome the data imbalance problem in the distribution of emergency event categories, a novel loss function is proposed to improve the performance of the BERT-based model. Meanwhile, to avoid the negative impact of extreme learning rates, the AdaBound optimization algorithm, which achieves a gradual, smooth transition from the Adam optimizer to the stochastic gradient descent optimizer, is employed to learn the parameters of the model. The feasibility and competitiveness of the proposed method are validated on both imbalanced and balanced datasets. Furthermore, the generic BERT, the ensemble LSTM-BERT (BERT-LB), the attention-based BiLSTM fused CNN with a gating mechanism (ABLG-CNN), TextRCNN, Att-BLSTM, and DPCNN are used as benchmarks on these two datasets. Meanwhile, sampling methods, including random sampling, ADASYN, the synthetic minority over-sampling technique (SMOTE), and Borderline-SMOTE, are employed to verify the performance of the proposed loss function on the imbalanced dataset. Compared with the benchmark methods, the proposed method achieves the best performance in terms of accuracy, weighted average precision, weighted average recall, and weighted average F1 score. Therefore, it is promising to employ the proposed method in real applications within smart emergency management systems.
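The abstract outlines the overall pipeline (BERT-derived features, an imbalance-aware loss, and AdaBound optimization) without giving its details, so the following is only a minimal PyTorch/Transformers sketch of that kind of setup. The model name `bert-base-chinese`, the class count, the class-weighted focal-style loss (a stand-in for the paper's unspecified novel loss), and the use of `AdamW` instead of AdaBound (which is available as a third-party package) are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, NOT the authors' code: a BERT-based classifier for Chinese
# emergency reports trained with a class-weighted, focal-style loss as a
# stand-in for the paper's unspecified novel loss function.
import torch
import torch.nn.functional as F
from transformers import BertTokenizer, BertForSequenceClassification

MODEL_NAME = "bert-base-chinese"   # assumed pretrained Chinese BERT
NUM_CLASSES = 8                    # assumed number of emergency categories

tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_CLASSES)
model.train()

def weighted_focal_loss(logits, labels, class_weights, gamma=2.0):
    """Cross-entropy re-weighted per class and modulated by (1 - p_t)^gamma,
    so rare and hard examples contribute more to the gradient."""
    log_probs = F.log_softmax(logits, dim=-1)
    pt = log_probs.exp().gather(1, labels.unsqueeze(1)).squeeze(1)   # prob. of the true class
    log_pt = log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    w = class_weights[labels]                                        # per-sample class weight
    return (-w * (1.0 - pt) ** gamma * log_pt).mean()

# The paper uses AdaBound (gradual Adam -> SGD transition); AdamW is used here
# only to keep the sketch dependency-free (AdaBound requires a third-party package).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy training step with a hypothetical report and label.
texts = ["某地发生山体滑坡，已紧急转移群众三百余人"]
labels = torch.tensor([0])
class_weights = torch.ones(NUM_CLASSES)   # in practice, e.g. inverse class frequency

batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
logits = model(**batch).logits
loss = weighted_focal_loss(logits, labels, class_weights)
loss.backward()
optimizer.step()
```

With unit class weights and gamma set to 0 this loss reduces to plain cross-entropy, which makes such an imbalance-aware term straightforward to ablate against the sampling baselines (random sampling, ADASYN, SMOTE, Borderline-SMOTE) mentioned in the abstract.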
Pages: 10417 - 10428
Number of pages: 12
Related Papers
50 records in total
  • [31] Imbalanced Chinese Text Classification Based on Weighted Sampling
    Li, Hu
    Zou, Peng
    Han, WeiHong
    Xia, Rongze
    TRUSTWORTHY COMPUTING AND SERVICES, 2014, 426 : 38 - 45
  • [32] Predictive intelligence in harmful news identification by BERT-based ensemble learning model with text sentiment analysis
    Lin, Szu-Yin
    Kung, Yun-Ching
    Leu, Fang-Yie
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (02)
  • [33] A Chinese text classification based on active
    Deng, Song
    Li, Qianliang
    Dai, Renjie
    Wei, Siming
    Wu, Di
    He, Yi
    Wu, Xindong
    APPLIED SOFT COMPUTING, 2024, 150
  • [34] MII: A Novel Text Classification Model Combining Deep Active Learning with BERT
    Zhang, Anman
    Li, Bohan
    Wang, Wenhuan
    Wan, Shuo
    Chen, Weitong
    CMC-COMPUTERS MATERIALS & CONTINUA, 2020, 63 (03): : 1499 - 1514
  • [36] On Cognitive Level Classification of Assessment-items Using Pre-trained BERT-based Model
    Dipto, Adnan Saif
    Limon, Md. Mahmudur Rahman
    Tuba, Fatima Tanjum
    Uddin, Md Mohsin
    Khan, M. Saddam Hossain
    Tuhin, Rashedul Amin
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2023, 2023, : 245 - 251
  • [37] Chinese Text Classification Based On LDA and KSVM
    Liang, Congwei
    Liu, Yong
    Du, Haiqing
    PROCEEDINGS OF THE 2015 JOINT INTERNATIONAL MECHANICAL, ELECTRONIC AND INFORMATION TECHNOLOGY CONFERENCE (JIMET 2015), 2015, 10 : 379 - 383
  • [38] Chinese long text similarity calculation of semantic progressive fusion based on Bert
    Li, Xiao
    Hu, Lanlan
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2024, 24 (4-5) : 2213 - 2225
  • [39] Financial causal sentence recognition based on BERT-CNN text classification
    Wan, Chang-Xuan
    Li, Bo
    JOURNAL OF SUPERCOMPUTING, 2022, 78 : 6503 - 6527
  • [40] BVMHA: Text classification model with variable multihead hybrid attention based on BERT
    Peng, Bo
    Zhang, Tao
    Han, Kundong
    Zhang, Zhe
    Ma, Yuquan
    Ma, Mengnan
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (01) : 1443 - 1454