Heterogeneous graph contrastive learning with adaptive data augmentation for semi-supervised short text classification

Citations: 0
Authors
Wu, Mingqiang [1]
Xu, Zhuoming [1]
Zheng, Lei [1]
Affiliations
[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing 211100, Jiangsu, Peoples R China
Keywords
data augmentation; heterogeneous graph contrastive learning; semi-supervised short text classification; short text clustering; soft prompt;
DOI
10.1111/exsy.13744
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Short text classification is widely used in many fields. Because labelled data are scarce, performing short text classification in a semi-supervised setting has become increasingly popular. Semi-supervised short text classification methods based on graph neural networks can achieve state-of-the-art performance by exploiting the expressive power of these networks. However, these methods usually fail to mine the hidden patterns in the large number of short text nodes in the graph to optimize the short text node embeddings. This limits the semantic representation power of the short texts and leads to suboptimal classification performance. To overcome this limitation, this paper proposes a novel semi-supervised short text classification method called Heterogeneous Graph Contrastive Learning with Adaptive Data Augmentation (HGCLADA). In its knowledge-base-guided, soft-prompt-based data augmentation component, words related to the tag words are used to optimize the soft prompts that generate diverse augmented samples. In its heterogeneous graph contrastive learning framework component, the paper proposes a heterogeneous graph constructed from short texts and keywords, together with an effective edge augmentation scheme based on a short text clustering algorithm. The resulting optimized short text embeddings enable effective semi-supervised short text classification. Extensive experiments on six benchmark datasets show that HGCLADA outperforms four classes of state-of-the-art methods in classification accuracy, with a particularly significant improvement of 8.74% on the TagMyNews dataset when each class contains only 20 labelled samples.
Pages: 28
Related papers
50 in total
  • [1] Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification
    Hu, Linmei
    Yang, Tianchi
    Shi, Chuan
    Ji, Houye
    Li, Xiaoli
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019 : 4821 - 4830
  • [2] Semi-Supervised Graph Contrastive Learning With Virtual Adversarial Augmentation
    Dong, Yixiang
    Luo, Minnan
    Li, Jundong
    Liu, Ziqi
    Zheng, Qinghua
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (08) : 4232 - 4244
  • [3] HGAT: Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification
    Yang, Tianchi
    Hu, Linmei
    Shi, Chuan
    Ji, Houye
    Li, Xiaoli
    Nie, Liqiang
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2021, 39 (03)
  • [4] Semi-Supervised Heterogeneous Graph Learning with Multi-Level Data Augmentation
    Chen, Ying
    Qiang, Siwei
    Ha, Mingming
    Liu, Xiaolei
    Li, Shaoshuai
    Tong, Jiabi
    Yuan, Lingfeng
    Guo, Xiaobo
    Zhu, Zhenfeng
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (02)
  • [5] Contrastive multi-graph learning with neighbor hierarchical sifting for semi-supervised text classification
    Ai, Wei
    Li, Jianbin
    Wang, Ze
    Wei, Yingying
    Meng, Tao
    Li, Keqin
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 266
  • [6] Semi-supervised heterogeneous graph contrastive learning with label-guided
    Li, Chao
    Sun, Guoyi
    Li, Xin
    Shan, Juan
    APPLIED INTELLIGENCE, 2024, 54 (20) : 10055 - 10071
  • [7] Adaptive Graph Learning for Semi-supervised Classification of GCNs
    Wan, Yingying
    Zhan, Mengmeng
    Li, Yangding
    DATABASES THEORY AND APPLICATIONS (ADC 2021), 2021, 12610 : 13 - 22
  • [8] Semi-supervised Short Text Classification Based On Dual-channel Data Augmentation
    Li, Jiajun
    Li, Peipei
    Hu, Xuegang
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [9] Data Augmentation for Graph Convolutional Network on Semi-supervised Classification
    Tang, Zhengzheng
    Qiao, Ziyue
    Hong, Xuehai
    Wang, Yang
    Dharejo, Fayaz Ali
    Zhou, Yuanchun
    Du, Yi
    WEB AND BIG DATA, APWEB-WAIM 2021, PT II, 2021, 12859 : 33 - 48
  • [10] Graph-based Semi-supervised Learning for Text Classification
    Widmann, Natalie
    Verberne, Suzan
    ICTIR'17: PROCEEDINGS OF THE 2017 ACM SIGIR INTERNATIONAL CONFERENCE THEORY OF INFORMATION RETRIEVAL, 2017 : 59 - 66