PL-Transformer: a POS-aware and layer ensemble transformer for text classification

Cited by: 3
Authors
Shi, Yu [1 ]
Zhang, Xi [1 ]
Yu, Ning [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv BUPT, Minist Educ, Beijing, Peoples R China
Keywords
Text classification; Transformer; Part-of-speech; Layer ensemble;
DOI
10.1007/s00521-022-07872-4
CLC classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-based models have become the de facto standard for natural language processing (NLP) tasks. However, most of these models are designed only to capture the implicit semantics among tokens, without exploiting extra off-the-shelf knowledge (e.g., parts of speech) that could facilitate the NLP tasks. Additionally, despite using multiple attention-based encoders, they utilize only the embeddings from the last layer, ignoring those from the other layers. To address these issues, in this paper we propose a novel POS-aware and layer ensemble transformer neural network (named PL-Transformer). PL-Transformer utilizes part-of-speech information explicitly and jointly leverages the outputs of the different encoder layers, using correlation coefficient attention in its encoder (C-Encoder). Moreover, the correlation coefficient attention bounds the dot product in the C-Encoder, which improves overall model performance. Extensive experiments on four datasets demonstrate that PL-Transformer improves text classification performance; for example, accuracy on the MPQA dataset is improved by 3.95% over the vanilla transformer.
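The abstract names two mechanisms: attention scores based on correlation coefficients (which, unlike raw dot products, are bounded) and an ensemble over all encoder layers rather than just the last one. This record gives no implementation details, so the following is only a minimal NumPy sketch under stated assumptions: Pearson correlation is used as the attention score, and the layer ensemble is a softmax-weighted sum of per-layer outputs; the names `corr_attention`, `layer_ensemble`, and `alphas` are illustrative, not from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def corr_attention(Q, K, V, eps=1e-9):
    """Attention whose scores are Pearson correlations between query and
    key vectors; each score is bounded in [-1, 1], unlike a raw dot product."""
    Qc = Q - Q.mean(axis=-1, keepdims=True)   # center each query vector
    Kc = K - K.mean(axis=-1, keepdims=True)   # center each key vector
    scores = (Qc @ Kc.T) / (
        np.linalg.norm(Qc, axis=-1)[:, None]
        * np.linalg.norm(Kc, axis=-1)[None, :] + eps)
    return softmax(scores) @ V

def layer_ensemble(layer_outputs, alphas):
    """Combine the outputs of all encoder layers with softmax-normalized
    weights `alphas` (assumed learned), instead of keeping only the last layer."""
    w = softmax(np.asarray(alphas, dtype=float))          # (L,)
    return np.tensordot(w, np.stack(layer_outputs), 1)    # (tokens, dim)
```

With equal `alphas` the ensemble reduces to the plain average of the layer outputs, which makes the bounded-score and ensemble behavior easy to check in isolation.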
Pages: 1971-1982
Page count: 12
Related papers
44 records in total
  • [41] SAM-CTMapper: Utilizing segment anything model and scale-aware mixed CNN-Transformer facilitates coastal wetland hyperspectral image classification
    Zou, Jiaqi
    He, Wei
    Wang, Haifeng
    Zhang, Hongyan
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2025, 139
  • [42] Dual representations: A novel variant of Self-Supervised Audio Spectrogram Transformer with multi-layer feature fusion and pooling combinations for sound classification
    Choi, Hyosun
    Zhang, Li
    Watkins, Chris
    NEUROCOMPUTING, 2025, 623
  • [43] Stock Selection via Expand-excite Conv Attention Autoencoder and Layer Sparse Attention Transformer: A Classification Approach-Inspired Time Series Sequence Recognition
    Fu, Wentao
    Sun, Jifeng
    Jiang, Yong
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [44] Uncertainty-aware automatic TNM staging classification for [18F] Fluorodeoxyglucose PET-CT reports for lung cancer utilising transformer-based language models and multi-task learning
    Barlow, Stephen H.
    Chicklore, Sugama
    He, Yulan
    Ourselin, Sebastien
    Wagner, Thomas
    Barnes, Anna
    Cook, Gary J. R.
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)