A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model

Cited by: 1
Authors
Um, Taehum [1 ]
Kim, Namhyoung [1 ]
Affiliations
[1] Gachon Univ, Dept Appl Stat, 1342 Seongnam Daero, Seongnam 13120, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 17
Funding
National Research Foundation of Singapore;
Keywords
natural language processing; neural topic model; ELECTRA; ALBERT; multi-classification;
D O I
10.3390/app14177898
Chinese Library Classification
O6 [Chemistry];
Discipline Code
0703;
Abstract
As an extension of the transformer architecture, the BERT model introduced a new paradigm for natural language processing, achieving impressive results on various downstream tasks. However, high-performance BERT-based models such as ELECTRA, ALBERT, and RoBERTa suffer from limitations including poor continual learning capability and an insufficient understanding of domain-specific documents. To address these issues, we propose using an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Moreover, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, enabling BERT-based models to better comprehend the contexts of specific fields. We conducted experiments on three datasets (the Movie Review Dataset (MRD), 20Newsgroups, and YELP) to evaluate our model's performance. Compared with the vanilla models, the proposed approach improved the accuracy of the ALBERT model by 1-2% on multi-classification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%.
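The fusion the abstract describes (attending from a neural-topic-model document vector over the transformer's token states) could be sketched roughly as below. This is a minimal NumPy illustration, not the authors' exact formulation: the function name `topic_attention_fusion`, the projection matrices `W_q`/`W_k`, and the concatenation with the [CLS] state are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topic_attention_fusion(hidden, topic, W_q, W_k):
    """Attend over encoder token states with a topic-derived query.

    hidden: (seq, d) token representations from a BERT-style encoder
    topic:  (k,) document-topic vector from a neural topic model
    W_q:    (k, d) projects the topic vector into the query space
    W_k:    (d, d) projects token states into the key space
    Returns a (2*d,) vector: [CLS] state concatenated with the
    topic-weighted context vector.
    """
    q = topic @ W_q                              # query from topic vector, (d,)
    keys = hidden @ W_k                          # keys from token states, (seq, d)
    scores = keys @ q / np.sqrt(q.shape[-1])     # scaled dot-product scores, (seq,)
    alpha = softmax(scores)                      # attention weights over tokens
    context = alpha @ hidden                     # topic-aware context vector, (d,)
    return np.concatenate([hidden[0], context])  # fuse with the [CLS] state

# Toy shapes: 16 tokens, hidden size 32, 10 topics.
rng = np.random.default_rng(0)
seq, d, k = 16, 32, 10
hidden = rng.normal(size=(seq, d))
topic = rng.normal(size=k)
W_q = rng.normal(size=(k, d))
W_k = rng.normal(size=(d, d))
fused = topic_attention_fusion(hidden, topic, W_q, W_k)
print(fused.shape)  # (64,)
```

In a full model, the fused vector would feed a classification head, and the projections would be learned jointly with the encoder rather than drawn at random as here.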
Pages: 14
Related Papers
50 records total
  • [31] Performance Evaluation of Transformer-based NLP Models on Fake News Detection Datasets
    Babu, Raveen Narendra
    Lung, Chung-Horng
    Zaman, Marzia
    2023 IEEE 47TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC, 2023, : 316 - 321
  • [32] Predicting Generalized Anxiety Disorder From Impromptu Speech Transcripts Using Context-Aware Transformer-Based Neural Networks: Model Evaluation Study
    Teferra, Bazen Gashaw
    Rose, Jonathan
    JMIR MENTAL HEALTH, 2023, 10
  • [33] MuLan-Methyl-multiple transformer-based language models for accurate DNA methylation prediction
    Zeng, Wenhuan
    Gautam, Anupam
    Huson, Daniel H.
    GIGASCIENCE, 2023, 12
  • [34] RoBIn: A Transformer-based model for risk of bias inference with machine reading comprehension
    Dias, Abel Correa
    Moreira, Viviane Pereira
    Comba, Joao Luiz Dihl
    JOURNAL OF BIOMEDICAL INFORMATICS, 2025, 166
  • [37] Sentiment Mining in E-Commerce: The Transformer-based Deep Learning Model
    Alsaedi, Tahani
    Nawaz, Asif
    Alahmadi, Abdulrahman
    Rana, Muhammad Rizwan Rashid
    Raza, Ammar
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2024, 15 (08) : 641 - 650
  • [38] A Neural Language Model with a Modified Attention Mechanism for Software Code
    Zhang, Xian
    Ben, Kerong
    PROCEEDINGS OF 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2018, : 232 - 236
  • [39] A Transformer-Based Substitute Recommendation Model Incorporating Weakly Supervised Customer Behavior Data
    Ye, Wenting
    Yang, Hongfei
    Zhao, Shuai
    Fang, Haoyang
    Shi, Xingjian
    Neppalli, Naveen
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 3325 - 3329
  • [40] Identification of Dietary Supplement Use from Electronic Health Records Using Transformer-based Language Models
    Zhou, Sicheng
    Schutte, Dalton
    Xing, Aiwen
    Chen, Jiyang
    Wolfson, Julian
    He, Zhe
    Yu, Fang
    Zhang, Rui
    2021 IEEE 9TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2021), 2021, : 513 - 514