On the Validity of Pre-Trained Transformers for Natural Language Processing in the Software Engineering Domain

Cited by: 35
Authors
von der Mosel, Julian [1 ]
Trautsch, Alexander [1 ]
Herbold, Steffen [2 ]
Affiliations
[1] Univ Gottingen, Inst Comp Sci, Gottingen, Germany
[2] Tech Univ Clausthal, Inst Software & Syst Engn, D-38678 Clausthal Zellerfeld, Germany
Keywords
Transformers; Task analysis; Context modeling; Bit error rate; Data models; Software engineering; Adaptation models; Natural language processing; software engineering; transformers; SENTIMENT ANALYSIS;
DOI
10.1109/TSE.2022.3178469
Chinese Library Classification
TP31 [Computer Software];
Discipline Code
081202; 0835;
Abstract
Transformers are the current state-of-the-art of natural language processing in many domains and are gaining traction within software engineering research as well. Such models are pre-trained on large amounts of data, usually from the general domain. However, we only have a limited understanding regarding the validity of transformers within the software engineering domain, i.e., how good such models are at understanding words and sentences within a software engineering context and how this improves the state-of-the-art. Within this article, we shed light on this complex but crucial issue. We compare BERT transformer models trained with software engineering data with transformers based on general domain data in multiple dimensions: their vocabulary, their ability to understand which words are missing, and their performance in classification tasks. Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable, while general domain models are sufficient for general language understanding, also within the software engineering domain.
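Two of the comparison dimensions named in the abstract (vocabulary coverage and masked-word prediction) can be illustrated with the Hugging Face transformers library. The following is a minimal sketch, not the paper's evaluation pipeline: "bert-base-uncased" stands in for the general-domain baseline, and the commented-out SE-domain checkpoint name is a placeholder assumption, not necessarily a model used in the paper.

    # A minimal sketch, assuming the Hugging Face "transformers" library is installed.
    from transformers import AutoTokenizer, pipeline

    checkpoints = [
        "bert-base-uncased",          # general-domain baseline
        # "your-org/se-domain-bert",  # placeholder: an SE-domain BERT checkpoint (assumption)
    ]

    for checkpoint in checkpoints:
        # 1) Vocabulary: does the tokenizer keep SE terms whole or split them into sub-words?
        tokenizer = AutoTokenizer.from_pretrained(checkpoint)
        for term in ["NullPointerException", "refactoring", "stacktrace"]:
            print(checkpoint, term, "->", tokenizer.tokenize(term))

        # 2) Masked-word prediction: which tokens does the model propose in an SE sentence?
        fill_mask = pipeline("fill-mask", model=checkpoint)
        masked = f"The method throws a null pointer {fill_mask.tokenizer.mask_token}."
        for prediction in fill_mask(masked):
            print(checkpoint, prediction["token_str"], round(prediction["score"], 3))

For the third dimension, classification tasks, both checkpoints would additionally be fine-tuned on labeled software engineering data, e.g. with AutoModelForSequenceClassification and the Trainer API from the same library.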
Pages: 1487-1507
Page count: 21
Related papers
50 in total
  • [1] Pre-trained models for natural language processing: A survey
    Qiu XiPeng
    Sun TianXiang
    Xu YiGe
    Shao YunFan
    Dai Ning
    Huang XuanJing
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2020, 63 (10) : 1872 - 1897
  • [3] Pre-trained Language Models in Biomedical Domain: A Systematic Survey
    Wang, Benyou
    Xie, Qianqian
    Pei, Jiahuan
    Chen, Zhihong
    Tiwari, Prayag
    Li, Zhao
    Fu, Jie
    ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [4] ARoBERT: An ASR Robust Pre-Trained Language Model for Spoken Language Understanding
    Wang, Chengyu
    Dai, Suyang
    Wang, Yipeng
    Yang, Fei
    Qiu, Minghui
    Chen, Kehan
    Zhou, Wei
    Huang, Jun
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1207 - 1218
  • [5] Harnessing pre-trained generalist agents for software engineering tasks
    Mindom, Paulina Stevia Nouwou
    Nikanjam, Amin
    Khomh, Foutse
    EMPIRICAL SOFTWARE ENGINEERING, 2025, 30 (01)
  • [6] Leveraging Pre-Trained Language Model for Summary Generation on Short Text
    Zhao, Shuai
    You, Fucheng
    Liu, Zeng Yuan
    IEEE ACCESS, 2020, 8 : 228798 - 228803
  • [7] Addressing Extraction and Generation Separately: Keyphrase Prediction With Pre-Trained Language Models
    Liu, Rui
    Lin, Zheng
    Wang, Weiping
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3180 - 3191
  • [8] Face Inpainting with Pre-trained Image Transformers
    Gonc, Kaan
    Saglam, Baturay
    Kozat, Suleyman S.
    Dibeklioglu, Hamdi
    2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2022,
  • [9] Surgicberta: a pre-trained language model for procedural surgical language
    Bombieri, Marco
    Rospocher, Marco
    Ponzetto, Simone Paolo
    Fiorini, Paolo
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 18 (01) : 69 - 81
  • [10] Pre-trained transformers: an empirical comparison
    Casola, Silvia
    Lauriola, Ivano
    Lavelli, Alberto
    MACHINE LEARNING WITH APPLICATIONS, 2022, 9