Machine learning approaches for automated software traceability: A systematic literature review

Citations: 0
Authors
Alturayeif, Nouf [1, 2]
Hassine, Jameleddine [1, 3]
Ahmad, Irfan [1, 4]
Affiliations
[1] KFUPM, Informat & Comp Sci Dept, Dhahran 31261, Saudi Arabia
[2] Imam Abdulrahman Bin Faisal Univ, Comp Dept, Dammam 31441, Saudi Arabia
[3] Interdisciplinary Res Ctr Intelligent Secure Syst, Dhahran 31261, Saudi Arabia
[4] KFUPM, SDAIA KFUPM Joint Res Ctr Artificial Intelligence, Dhahran 31261, Saudi Arabia
Keywords
Software traceability; Machine learning; Deep learning; Transfer learning; Systematic literature review; Link recovery; Code
DOI
10.1016/j.jss.2025.112536
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Software traceability is the process of tracking and managing relationships between software artifacts throughout the Software Development Life-Cycle (SDLC). It ensures that all software artifacts are correctly linked, facilitating change management, impact analysis, and regulatory compliance. Automated traceability can be achieved using Information Retrieval (IR) and Machine Learning (ML) approaches. This systematic literature review summarizes and synthesizes ML-based automated traceability studies. Given the rapid advances in ML, analyzing current research is crucial for progress in the field. We identified 59 studies published between 2014 and June 2024. We found an increase in publications, particularly in 2023 and continuing into 2024, with sustained citation impact. Around 170 datasets from different domains are used, covering natural-language and programming-language artifacts. Common artifacts include use cases and source code, with a focus on the Requirements Analysis and Implementation phases. Existing solutions mostly use classification and supervised learning, with emerging use of deep learning and Large Language Models (LLMs), which show superior performance. We identified challenges and gaps, along with future trends, to guide researchers. Challenges include imbalanced datasets, data scarcity, and limited real-world data, while gaps include handling missing true links, the lack of benchmark datasets, and limited exploration of LLMs. Lastly, we provide recommendations for researchers based on the findings.
Pages: 38