PL-Transformer: a POS-aware and layer ensemble transformer for text classification

Cited by: 3
Authors
Shi, Yu [1 ]
Zhang, Xi [1 ]
Yu, Ning [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
Keywords
Text classification; Transformer; Part-of-speech; Layer ensemble;
DOI
10.1007/s00521-022-07872-4
CLC classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-based models have become the de facto standard for natural language processing (NLP) tasks. However, most of these models are designed only to capture the implicit semantics among tokens, without exploiting extra off-the-shelf knowledge (e.g., parts of speech) that could facilitate the NLP tasks. Additionally, despite using multiple attention-based encoders, they utilize only the embeddings from the last layer, ignoring those from the other layers. To address these issues, in this paper we propose a novel POS-aware and layer ensemble transformer neural network (named PL-Transformer). PL-Transformer utilizes part-of-speech information explicitly and jointly leverages the outputs of the different encoder layers, using correlation coefficient attention in its encoder (C-Encoder). Moreover, the correlation coefficient attention bounds the dot product in the C-Encoder, which improves overall model performance. Extensive experiments on four datasets demonstrate that PL-Transformer improves text classification performance; for example, accuracy on the MPQA dataset is improved by 3.95% over the vanilla transformer.
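The abstract names two mechanisms: attention scores based on correlation coefficients (which, unlike raw dot products, are bounded) and an ensemble over all encoder layers rather than just the last one. This record gives no implementation details, so the following is only a minimal NumPy sketch under stated assumptions: Pearson correlation is used as the attention score, and the layer ensemble is a softmax-weighted sum of per-layer outputs; the names `corr_attention`, `layer_ensemble`, and `alphas` are illustrative, not from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def corr_attention(Q, K, V, eps=1e-9):
    """Attention whose scores are Pearson correlations between query and
    key vectors; each score is bounded in [-1, 1], unlike a raw dot product."""
    Qc = Q - Q.mean(axis=-1, keepdims=True)   # center each query vector
    Kc = K - K.mean(axis=-1, keepdims=True)   # center each key vector
    scores = (Qc @ Kc.T) / (
        np.linalg.norm(Qc, axis=-1)[:, None]
        * np.linalg.norm(Kc, axis=-1)[None, :] + eps)
    return softmax(scores) @ V

def layer_ensemble(layer_outputs, alphas):
    """Combine the outputs of all encoder layers with softmax-normalized
    weights `alphas` (assumed learned), instead of keeping only the last layer."""
    w = softmax(np.asarray(alphas, dtype=float))          # (L,)
    return np.tensordot(w, np.stack(layer_outputs), 1)    # (tokens, dim)
```

With equal `alphas` the ensemble reduces to the plain average of the layer outputs, which makes the bounded-score and ensemble behavior easy to check in isolation.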
Pages: 1971-1982
Page count: 12
Related papers
44 records in total
  • [41] SAM-CTMapper: Utilizing segment anything model and scale-aware mixed CNN-Transformer facilitates coastal wetland hyperspectral image classification
    Zou, Jiaqi
    He, Wei
    Wang, Haifeng
    Zhang, Hongyan
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2025, 139
  • [42] Dual representations: A novel variant of Self-Supervised Audio Spectrogram Transformer with multi-layer feature fusion and pooling combinations for sound classification
    Choi, Hyosun
    Zhang, Li
    Watkins, Chris
    NEUROCOMPUTING, 2025, 623
  • [43] Stock Selection via Expand-excite Conv Attention Autoencoder and Layer Sparse Attention Transformer: A Classification Approach-Inspired Time Series Sequence Recognition
    Fu, Wentao
    Sun, Jifeng
    Jiang, Yong
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [44] Uncertainty-aware automatic TNM staging classification for [18F] Fluorodeoxyglucose PET-CT reports for lung cancer utilising transformer-based language models and multi-task learning
    Barlow, Stephen H.
    Chicklore, Sugama
    He, Yulan
    Ourselin, Sebastien
    Wagner, Thomas
    Barnes, Anna
    Cook, Gary J. R.
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)