Applications of natural language processing in software traceability: A systematic mapping study

Cited by: 16
Authors
Pauzi, Zaki [1]
Capiluppi, Andrea [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst, Nijenborgh 9, NL-9747 AG Groningen, Netherlands
Keywords
Software traceability; Information retrieval; Natural language processing; SOURCE-CODE; BUG REPORTS; LINKS; REQUIREMENTS; INFORMATION; LOCATION; TIQI
DOI
10.1016/j.jss.2023.111616
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
A key part of software evolution and maintenance is the continuous integration from collaborative efforts, often resulting in complex traceability challenges between software artifacts: features and modules remain scattered in the source code, and traceability links become harder to recover. In this paper, we perform a systematic mapping study dealing with recent research recovering these links through information retrieval, with a particular focus on natural language processing (NLP). Our search strategy gathered a total of 96 papers in focus of our study, covering a period from 2013 to 2021. We conducted trend analysis on NLP techniques and tools involved, and traceability efforts (applying NLP) across the software development life cycle (SDLC). Based on our study, we have identified the following key issues, barriers, and setbacks: syntax convention, configuration, translation, explainability, properties representation, tacit knowledge dependency, scalability, and data availability. Based on these, we consolidated the following open challenges: representation similarity across artifacts, the effectiveness of NLP for traceability, and achieving scalable, adaptive, and explainable models. To address these challenges, we recommend a holistic framework for NLP solutions to achieve effective traceability and efforts in achieving interoperability and explainability in NLP models for traceability. © 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 19