A systematic process for Mining Software Repositories: Results from a systematic literature review

Cited by: 17
Authors
Vidoni, M. [1]
Affiliations
[1] Australian Natl Univ, CECS Sch Comp, Canberra, ACT, Australia
Keywords
Mining Software Repositories; Systematic literature review; Evidence-based software engineering; Guidelines; GITHUB; CLASSIFICATION; DEVELOPERS; PROJECTS; DATASET; FLOW
DOI
10.1016/j.infsof.2021.106791
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Mining Software Repositories (MSR) is a growing area of Software Engineering (SE) research. Since their emergence in 2004, many investigations have analysed different aspects of these studies. However, there are no guidelines on how to conduct systematic MSR studies. There is a need to evaluate how MSR research is approached to provide a framework to do so systematically.
Objective: To identify how MSR studies are conducted in terms of repository selection and data extraction. To uncover potential for improvement in directing systematic research and providing guidelines to do so.
Method: A systematic literature review of MSR studies was conducted following the guidelines and template proposed by Mian et al. (which refines those provided by Kitchenham and Charters). These guidelines were extended and revised to provide a framework for systematic MSR studies.
Results: MSR studies typically do not follow a systematic approach for repository selection, and many do not report selection or data extraction protocols. Furthermore, few manuscripts discuss threats to the study's validity due to the selection or data extraction steps followed.
Conclusions: Although MSR studies are evidence-based research, they seldom follow a systematic process. Hence, there is a need for guidelines on how to conduct systematic MSR studies. New guidelines and a template have been proposed, consolidating related studies in the MSR field and strategies for systematic literature reviews.
Pages: 17