On the assignment of commits to releases

Cited by: 0
Authors
Felipe Curty do Rego Pinto
Leonardo Gresta Paulino Murta
Affiliation
[1] Universidade Federal Fluminense, Instituto de Computação
Source
Empirical Software Engineering | 2023 / Vol. 28
Keywords
Release; Release mining; Commit assignment; Release analysis
DOI
Not available
Abstract
Release is a ubiquitous concept in software development, referring to the grouping of multiple independent changes into a deliverable piece of software. Mining releases can help developers understand software evolution at a coarse grain, identify which features were delivered or bugs were fixed, and pinpoint who contributed to a given release. A typical initial step of release mining consists of identifying which commits compose a given release. We found two main strategies used in the literature to perform this task: time-based and range-based. Some release mining works recognize that those strategies are subject to misclassifications but do not quantify the impact of such a threat. This paper analyzes 13,419 releases and 1,414,997 commits from 100 relevant open-source projects hosted on GitHub to assess both strategies in terms of precision and recall. We observed that, in general, the range-based strategy yields better results than the time-based strategy. Nevertheless, even when the range-based strategy is in place, some releases still show misclassifications. Thus, our paper also discusses some situations in which each strategy degrades, potentially biasing the mining results if not adequately known and avoided.
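The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "time-based" means selecting commits whose timestamp falls between two consecutive release dates, and that "range-based" means selecting commits reachable from the current release but not from the previous one. The toy history, the commit ids, the function names, and the hand-labelled ground truth are all hypothetical.

```python
from datetime import datetime

# Hypothetical toy history: commit id -> (timestamp, parent ids).
# "f1" lives on a feature branch that is not merged when v2 is tagged.
COMMITS = {
    "a": (datetime(2023, 1, 1), []),
    "b": (datetime(2023, 1, 5), ["a"]),    # tagged as release v1
    "c": (datetime(2023, 1, 10), ["b"]),
    "f1": (datetime(2023, 1, 12), ["b"]),  # unmerged feature branch
    "d": (datetime(2023, 1, 15), ["c"]),   # tagged as release v2
}


def reachable(head):
    """Return every commit reachable from `head` by following parents."""
    seen, stack = set(), [head]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(COMMITS[cid][1])
    return seen


def range_based(prev_release, cur_release):
    """Assumed reading: commits reachable from cur but not from prev."""
    return reachable(cur_release) - reachable(prev_release)


def time_based(prev_release, cur_release):
    """Assumed reading: commits dated after prev and up to cur."""
    start, end = COMMITS[prev_release][0], COMMITS[cur_release][0]
    return {cid for cid, (ts, _) in COMMITS.items() if start < ts <= end}


def precision_recall(predicted, actual):
    """Standard set-based precision and recall."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hand-labelled ground truth for v2 in this toy example (an assumption).
actual_v2 = {"c", "d"}
print(precision_recall(time_based("b", "d"), actual_v2))
print(precision_recall(range_based("b", "d"), actual_v2))
```

In this toy history the time-based selection picks up the unmerged commit `f1` because its date falls inside the release window, which lowers precision, while the range-based selection excludes it. Real repositories can behave differently, as the abstract notes that both strategies degrade in some situations.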