Heterogeneous graph contrastive learning with adaptive data augmentation for semi-supervised short text classification

Authors
Wu, Mingqiang [1]
Xu, Zhuoming [1]
Zheng, Lei [1]
Affiliations
[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing 211100, Jiangsu, Peoples R China
Keywords
data augmentation; heterogeneous graph contrastive learning; semi-supervised short text classification; short text clustering; soft prompt
DOI
10.1111/exsy.13744
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short text classification is widely used in many fields. Because labelled data are scarce, performing short text classification in a semi-supervised learning setting has become increasingly popular. Semi-supervised short text classification methods based on graph neural networks can achieve state-of-the-art classification performance by exploiting the expressive power of graph neural networks. However, these methods usually fail to mine the hidden patterns in the large number of short text nodes in the graph to optimize the short text node embeddings. This limits the semantic representation power of the short texts and leads to suboptimal classification performance. To overcome this limitation, this paper proposes a novel semi-supervised short text classification method called Heterogeneous Graph Contrastive Learning with Adaptive Data Augmentation (HGCLADA). In the knowledge-base-guided, soft-prompt-based data augmentation component, words related to the tag words are used to optimize the soft prompts so that they generate diverse augmented samples. In the heterogeneous graph contrastive learning framework component, the paper proposes a heterogeneous graph constructed from short texts and keywords, together with an effective edge augmentation scheme based on a short text clustering algorithm. The resulting optimized short text embeddings enable effective semi-supervised short text classification. Extensive experiments on six benchmark datasets show that the HGCLADA method outperforms four classes of state-of-the-art methods in terms of classification accuracy, with a particularly significant improvement of 8.74% on the TagMyNews dataset when each class contains only 20 labelled samples.
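The abstract does not state which contrastive objective HGCLADA uses. As a rough illustration of the general idea behind graph contrastive learning (pulling together two views of the same short text node while pushing apart views of different nodes), the following is a minimal sketch of a generic InfoNCE-style loss. The function name `info_nce_loss`, the temperature value, and the random toy embeddings are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np


def info_nce_loss(view_a, view_b, temperature=0.5):
    """Generic InfoNCE loss between two (n, d) embedding views.

    Row i of view_a and row i of view_b are treated as a positive pair;
    all other rows of view_b act as negatives for row i of view_a.
    This is an illustrative stand-in, not the loss defined in the paper.
    """
    # L2-normalise rows so that dot products become cosine similarities.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    # Subtract the row-wise maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal of the similarity matrix.
    return float(-np.mean(np.diag(log_prob)))


# Toy data (hypothetical): 8 "short text" embeddings of dimension 16.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))
# A lightly perturbed copy plays the role of an augmented view.
augmented = embeddings + 0.01 * rng.normal(size=(8, 16))
# Rolling the rows pairs every embedding with a different node's view.
mismatched = np.roll(augmented, 1, axis=0)

loss_aligned = info_nce_loss(embeddings, augmented)
loss_mismatched = info_nce_loss(embeddings, mismatched)
print(loss_aligned, loss_mismatched)
```

In this toy setup the loss for correctly paired views should come out lower than the loss for mismatched pairs, which is the signal a contrastive encoder is trained to minimise.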
Pages: 28