Pre-trained Model Based Feature Envy Detection

Cited by: 5
Authors
Ma, Wenhao [1 ]
Yu, Yaoxiang [1 ]
Ruan, Xiaoming [1 ]
Cai, Bo [1 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan, Peoples R China
Keywords
Feature Envy; Deep Learning; Software Refactoring; Pre-trained Model; Code Smell; CODE;
DOI
10.1109/MSR59073.2023.00065
CLC Classification Number
TP31 [Computer Software];
Discipline Classification Code
081202 ; 0835 ;
Abstract
Code smells slow down software system development and make systems harder to maintain. Existing research aims to develop automatic detection algorithms that reduce the labor and time costs of the detection process. Deep learning techniques have recently been shown to recognize code smells even better than metric-based heuristic detection algorithms. As large-scale pre-trained models for Programming Languages (PL), such as CodeT5, have lately achieved top results on a variety of downstream tasks, some researchers have begun to explore using pre-trained models to extract the contextual semantics of code for code smell detection. However, little research has employed the contextual semantic relationships between code snippets obtained by pre-trained models to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets in order to detect feature envy, one of the most common code smells. In addition, to examine how semantic relationships extracted by pre-trained models of different architectures perform on feature envy detection, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. In an experimental evaluation on ten open-source projects, our approach improves F-measure by 29.32% on feature envy detection and by 16.57% on moving destination recommendation. Using semantic relations extracted by several pre-trained models to detect feature envy outperforms the state of the art, which shows that this semantic relation is a promising signal for feature envy detection. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.
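To make the idea concrete, the following minimal sketch (our illustration, not the authors' released code) shows one way a pre-trained CodeT5 encoder can be used to embed code snippets and compare a method against its enclosing class and a candidate destination class. It assumes the HuggingFace checkpoint Salesforce/codet5-base, and it substitutes a simple cosine-similarity comparison for the classifier the paper presumably trains on such semantic relationships.

```python
# Hedged sketch: snippet-level semantics from CodeT5's encoder, compared by
# cosine similarity. The actual detector in the paper is a trained model;
# this only illustrates the "semantic relationship between snippets" idea.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
encoder = T5EncoderModel.from_pretrained("Salesforce/codet5-base")
encoder.eval()

def embed(code: str) -> torch.Tensor:
    """Mean-pool the encoder hidden states into one embedding per snippet."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        states = encoder(**inputs).last_hidden_state      # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)          # (1, seq_len, 1)
    return (states * mask).sum(dim=1) / mask.sum(dim=1)    # (1, hidden)

# Hypothetical snippets for illustration only.
method_src = "public int total() { return cart.sum() + cart.tax(); }"
enclosing_class_src = "class OrderPrinter { /* formatting helpers */ }"
candidate_class_src = "class Cart { int sum() { return 0; } int tax() { return 0; } }"

# If the method is semantically closer to another class than to its own class,
# it is a feature-envy candidate and that class a possible move destination.
own = torch.cosine_similarity(embed(method_src), embed(enclosing_class_src))
other = torch.cosine_similarity(embed(method_src), embed(candidate_class_src))
print("feature envy suspected:", (other > own).item())
```

Swapping T5EncoderModel for CodeBERT's or CodeGPT's encoder would reproduce the architectural comparison described in the abstract, again only as a rough approximation of the evaluated pipeline.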
Pages: 430 - 440
Number of pages: 11
Related Papers
50 records in total
  • [1] Software Vulnerabilities Detection Based on a Pre-trained Language Model
    Xu, Wenlin
    Li, Tong
    Wang, Jinsong
    Duan, Haibo
    Tang, Yahui
    2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 904 - 911
  • [2] Web-FTP: A Feature Transferring-Based Pre-Trained Model for Web Attack Detection
    Guo, Zhenyu
    Shang, Qinghua
    Li, Xin
    Li, Chengyi
    Zhang, Zijian
    Zhang, Zhuo
    Hu, Jingjing
    An, Jincheng
    Huang, Chuanming
    Chen, Yang
    Cai, Yuguang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (03) : 1495 - 1507
  • [3] Continual Learning with Bayesian Model Based on a Fixed Pre-trained Feature Extractor
    Yang, Yang
    Cui, Zhiying
    Xu, Junjie
    Zhong, Changhong
    Wang, Ruixuan
    Zheng, Wei-Shi
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT V, 2021, 12905 : 397 - 406
  • [4] Continual learning with Bayesian model based on a fixed pre-trained feature extractor
Yang, Yang
Cui, Zhiying
Xu, Junjie
Zhong, Changhong
Zheng, Wei-Shi
Wang, Ruixuan
    Visual Intelligence, 1 (1):
  • [5] Data Augmentation Based on Pre-trained Language Model for Event Detection
    Zhang, Meng
    Xie, Zhiwen
    Liu, Jin
    CCKS 2021 - EVALUATION TRACK, 2022, 1553 : 59 - 68
  • [6] Detection of Chinese Deceptive Reviews Based on Pre-Trained Language Model
    Weng, Chia-Hsien
    Lin, Kuan-Cheng
    Ying, Jia-Ching
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [7] Pre-trained convolutional neural networks as feature extractors for tuberculosis detection
    Lopes, U. K.
    Valiati, J. F.
    COMPUTERS IN BIOLOGY AND MEDICINE, 2017, 89 : 135 - 143
  • [8] Style Change Detection: Method Based On Pre-trained Model And Similarity Recognition
    Foshan University, Foshan, China
    CEUR Workshop Proc., (2526-2531):
  • [9] Detection of Unstructured Sensitive Data Based on a Pre-Trained Model and Lattice Transformer
    Jin, Feng
    Wu, Shaozhi
    Liu, Xingang
    Su, Han
    Tian, Miao
    2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 180 - 185
  • [10] Feature Mixture on Pre-Trained Model for Few-Shot Learning
    Wang, Shuo
    Lu, Jinda
    Xu, Haiyang
    Hao, Yanbin
    He, Xiangnan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 4104 - 4115