A Heterogeneous Directed Graph Attention Network for inductive text classification using multilevel semantic embeddings

Cited by: 2
Authors
Lin, Mu [1 ]
Wang, Tao [1 ]
Zhu, Yifan [1 ]
Li, Xiaobo [1 ]
Zhou, Xin [1 ]
Wang, Weiping [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha 410073, Hunan, Peoples R China
Keywords
Text classification; Multilevel semantics; Graph Neural Networks; Graph Attention Networks; Text segmentation; Sentence-transformer
DOI
10.1016/j.knosys.2024.111797
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Number
081104; 0812; 0835; 1405
Abstract
In this study, a novel network model is proposed for text classification based on Graph Attention Networks (GATs) and sentence-transformer embeddings. Most existing methods that use a pretrained model as the input layer still treat words as the minimum processing unit. However, word embeddings are neither efficient nor appropriate for long texts containing many domain-specific terms. This study aims to design a model capable of handling text classification tasks at multiple levels of semantic segmentation. The main contribution of this study is a novel GAT variant that uses global nodes and Squeeze-and-Excitation Networks (SENet) to capture semantic information. Moreover, a novel unidirectional attention mechanism is introduced to prevent the propagation of irrelevant noisy information among global nodes. The numerical results show that, depending on the characteristics of the datasets, specific combinations of semantic information can effectively improve the accuracy of text classification. Without fine-tuning the pretrained encoder, new state-of-the-art performance is achieved on three benchmark datasets. In addition, a comprehensive analysis of the model's graph attention mechanism in specific cases suggests that the unidirectional attention mechanism and the use of global nodes are the key contributing factors to multilevel semantic fusion.
Pages: 16