Automatic traceability link recovery via active learning

Times cited: 14
Authors
Du, Tian-bao [1]
Shen, Guo-hua [1, 2, 3]
Huang, Zhi-qiu [1, 2, 3]
Yu, Yao-shen [1]
Wu, De-xiang [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
[2] Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 210093, Peoples R China
[3] Nanjing Univ Aeronaut & Astronaut, Key Lab Safety Crit Software, Nanjing 211106, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Automatic; Traceability link recovery; Manpower; Active learning; TP311; DOCUMENTATION; CODE
DOI
10.1631/FITEE.1900222
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Traceability link recovery (TLR) is an important and costly software task that requires humans to establish relationships between source and target artifact sets within the same project. Previous research has proposed establishing traceability links with machine learning approaches. However, current machine learning approaches cannot be readily applied to projects without existing traceability information (links), because training an effective predictive model requires humans to label too many traceability links. To save manpower, we propose a new TLR approach based on active learning (AL), called the AL-based approach. We evaluate the AL-based approach on seven commonly used traceability datasets and compare it with an information-retrieval-based approach and a state-of-the-art machine learning approach. The results indicate that the AL-based approach outperforms the other two approaches in terms of F-score.
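The abstract compares approaches by F-score. As a minimal sketch of how that measure could be computed for a set of recovered links, the snippet below uses the standard precision/recall definition; the function name `f_score`, the `(requirement, code file)` pair representation, and the sample links are illustrative assumptions, not taken from the paper.

```python
def f_score(predicted, actual, beta=1.0):
    """F-beta score of predicted links against ground-truth links.

    Links are hashable pairs such as (source_artifact, target_artifact).
    This representation is an assumption made for illustration.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0  # also covers empty predicted or actual sets
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical ground truth and predictions (not data from the paper).
actual_links = {
    ("R1", "A.java"), ("R1", "B.java"), ("R2", "C.java"), ("R3", "D.java"),
}
predicted_links = {("R1", "A.java"), ("R2", "C.java"), ("R2", "D.java")}

score = f_score(predicted_links, actual_links)
print(f"F1 = {score:.3f}")
```

With two of three predicted links correct and two of four true links recovered, precision is 2/3 and recall is 1/2, so F1 is 4/7.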
Pages: 1217-1225
Number of pages: 9