Demystifying the Impact of Open-Source Machine Learning Libraries on Software Analytics

Cited by: 0
Authors
Zhao, Yu [1 ]
Gong, Yihui [2 ]
Gong, Lina [1 ]
Jiang, Shujuan [3 ]
Huang, Zhiqiu [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Key Lab Safety Crit Software, Nanjing 211106, Peoples R China
[2] Dalian Univ Technol, Inst Adv Control Technol, Dalian 116024, Peoples R China
[3] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Libraries; Software; Python; Predictive models; Maximum likelihood estimation; Mathematical models; Codes; Software quality; Bayes methods; Support vector machines; Empirical software engineering (SE); machine learning (ML); model interpretation; software analytics; DEFECT PREDICTION; FRAMEWORK;
DOI
10.1109/TR.2024.3455390
CLC Classification Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Machine learning (ML) classification techniques from various libraries have been widely introduced into software engineering (SE) to mine instructive insights that help developers ensure software quality. However, because these ML libraries are used indiscriminately, the insights researchers report lack clear stability and consensus, and the absence of guidance on choosing among ML libraries prevents developers from applying such insights effectively in practice. Therefore, through a case study of 23 popular software datasets across three SE task domains (i.e., software defect prediction, issue lifetime estimation, and code smell detection), this article systematically studies the impact of open-source ML libraries on the performance consistency, performance stability, and model interpretation of six commonly used classifiers across two commonly used ML programming languages (i.e., Python and R). We find that, for a given classification technique: ML libraries with the tuned setting do not generate stable and consistent performance; ML libraries are sensitive with respect to model interpretation, even under the same parameter settings; and ML libraries from R tend to generate higher and more stable performance. Based on these findings, we suggest that future work in SE should report the specific ML libraries and the specific parameter settings used to discover the instructive insights for their tasks; try the ML libraries from R to build models with high and stable performance for software tasks; and use the same ML libraries for selecting important features as for constructing the classification model.
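The suggestion to report the exact ML library, version, and parameter settings, and to reuse the same library for both feature selection and classification, can be illustrated with a short sketch. The Python example below is illustrative only and is not taken from the article: it assumes scikit-learn and a synthetic dataset as stand-ins; a real study would substitute its own SE datasets and the R packages it compares against.

    # Illustrative sketch (not from the article): record the ML library,
    # its version, and the hyperparameters used, and reuse the same
    # library both to rank features and to build the final classifier.
    import json
    import sklearn
    from sklearn.datasets import make_classification  # synthetic stand-in for an SE dataset
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Hypothetical parameter settings; a real study would report its tuned values.
    params = {"n_estimators": 100, "max_depth": 5, "random_state": 42}

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = RandomForestClassifier(**params).fit(X_train, y_train)

    # Feature importances come from the same library and model used for classification.
    top_features = sorted(enumerate(clf.feature_importances_),
                          key=lambda kv: kv[1], reverse=True)[:5]

    # Provenance record that a paper or replication package could include verbatim.
    report = {
        "library": "scikit-learn",
        "version": sklearn.__version__,
        "classifier": "RandomForestClassifier",
        "parameters": params,
        "auc": roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]),
        "top_features": [int(i) for i, _ in top_features],
    }
    print(json.dumps(report, indent=2))

Including such a provenance record in a replication package makes it possible to check whether diverging conclusions stem from different libraries or different parameter settings rather than from the data itself.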
Pages: 15