Detecting Bad Smells with Machine Learning Algorithms: an Empirical Study

Cited by: 23
Authors
Cruz, Daniel [1]
Santana, Amanda [1]
Figueiredo, Eduardo [1]
Affiliations
[1] Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil
Source
2020 IEEE/ACM INTERNATIONAL CONFERENCE ON TECHNICAL DEBT, TECHDEBT | 2020
Keywords
bad smells detection; machine learning; software quality; software measurement; empirical software engineering; CODE SMELLS; MAINTAINABILITY; METRICS; IMPACT;
DOI
10.1145/3387906.3388618
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Bad smells are symptoms of bad design choices in the source code. They are one of the key indicators of technical debt, specifically design debt. To manage this kind of debt, it is important to be aware of bad smells and refactor them whenever possible. Therefore, several bad smell detection tools and techniques have been proposed over the years. These tools and techniques use different strategies to perform detection. More recently, machine learning algorithms have also been proposed to support bad smell detection. However, we lack empirical evidence on the accuracy and efficiency of these machine learning based techniques. In this paper, we present an evaluation of seven different machine learning algorithms on the task of detecting four types of bad smells. We also provide an analysis of the impact of software metrics on bad smell detection using a unified approach for interpreting the models' decisions. We found that, with the right optimization, machine learning algorithms can achieve good performance (F1 score) for two bad smells: God Class (0.86) and Refused Parent Bequest (0.67). We also uncovered which metrics play fundamental roles in detecting each bad smell.
Pages: 31-40
Number of pages: 10