Demystifying the Impact of Open-Source Machine Learning Libraries on Software Analytics

Cited by: 0
Authors
Zhao, Yu [1 ]
Gong, Yihui [2 ]
Gong, Lina [1 ]
Jiang, Shujuan [3 ]
Huang, Zhiqiu [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Key Lab Safety Crit Software, Nanjing 211106, Peoples R China
[2] Dalian Univ Technol, Inst Adv Control Technol, Dalian 116024, Peoples R China
[3] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Libraries; Software; Python; Predictive models; Maximum likelihood estimation; Mathematical models; Codes; Software quality; Bayes methods; Support vector machines; Empirical software engineering (SE); machine learning (ML); model interpretation; software analytics; DEFECT PREDICTION; FRAMEWORK;
DOI
10.1109/TR.2024.3455390
CLC Classification Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Machine learning (ML) classification techniques from various libraries have been widely introduced into software engineering (SE) to mine instructive insights that help developers ensure software quality. However, because these ML libraries are used indiscriminately, the insights researchers report lack clear stability and consensus, and the absence of guidance on choosing among ML libraries prevents developers from applying such insights effectively in practice. Therefore, through a case study of 23 popular software datasets across three SE task domains (i.e., software defect prediction, issue lifetime estimation, and code smell detection), this article systematically studies the impact of open-source ML libraries on the performance consistency, performance stability, and model interpretation of six commonly used classifiers across two commonly used ML programming languages (i.e., Python and R). We find that, for a given classification technique: ML libraries with the tuned setting do not generate stable and consistent performance; ML libraries are sensitive with respect to model interpretation, even under the same parameter settings; and ML libraries from R tend to generate higher and more stable performance. Based on these findings, we suggest that future work in SE should report the specific ML libraries and the specific parameter settings used to discover the instructive insights for their tasks; try the ML libraries from R to build models with high and stable performance for software tasks; and use the same ML libraries for selecting important features as for constructing the classification model.
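The suggestion to report the exact ML library, version, and parameter settings, and to reuse the same library for both feature selection and classification, can be illustrated with a short sketch. The Python example below is illustrative only and is not taken from the article: it assumes scikit-learn and a synthetic dataset as stand-ins; a real study would substitute its own SE datasets and the R packages it compares against.

    # Illustrative sketch (not from the article): record the ML library,
    # its version, and the hyperparameters used, and reuse the same
    # library both to rank features and to build the final classifier.
    import json
    import sklearn
    from sklearn.datasets import make_classification  # synthetic stand-in for an SE dataset
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Hypothetical parameter settings; a real study would report its tuned values.
    params = {"n_estimators": 100, "max_depth": 5, "random_state": 42}

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = RandomForestClassifier(**params).fit(X_train, y_train)

    # Feature importances come from the same library and model used for classification.
    top_features = sorted(enumerate(clf.feature_importances_),
                          key=lambda kv: kv[1], reverse=True)[:5]

    # Provenance record that a paper or replication package could include verbatim.
    report = {
        "library": "scikit-learn",
        "version": sklearn.__version__,
        "classifier": "RandomForestClassifier",
        "parameters": params,
        "auc": roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]),
        "top_features": [int(i) for i, _ in top_features],
    }
    print(json.dumps(report, indent=2))

Including such a provenance record in a replication package makes it possible to check whether diverging conclusions stem from different libraries or different parameter settings rather than from the data itself.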
Pages: 15