A machine-learning based ensemble method for anti-patterns detection

Cited by: 27
Authors
Barbez, Antoine [1]
Khomh, Foutse [1]
Gueheneuc, Yann-Gael [2]
Affiliations
[1] Polytech Montreal, Montreal, PQ, Canada
[2] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Software quality; Anti-patterns; Machine learning; Ensemble methods; IDENTIFICATION; IMPACT; SMELLS; CODE;
DOI
10.1016/j.jss.2019.110486
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Anti-patterns are poor solutions to recurring design problems. Several empirical studies have highlighted their negative impact on program comprehension, maintainability, and fault-proneness. A variety of detection approaches have been proposed to identify their occurrences in source code. However, these approaches can identify only a subset of the occurrences and report large numbers of false positives and misses. Furthermore, agreement among different approaches is generally low. Recent studies have shown the potential of machine-learning models to improve this situation. However, such algorithms require large sets of manually produced training data, which often limits their application in practice. In this paper, we present SMAD (SMart Aggregation of Anti-patterns Detectors), a machine-learning based ensemble method to aggregate various anti-patterns detection approaches on the basis of their internal detection rules. Thus, our method uses several detection tools to produce an improved prediction from a reasonable number of training examples. We implemented SMAD for the detection of two well-known anti-patterns: God Class and Feature Envy. With the results of our experiments conducted on eight Java projects, we show that: (1) our method clearly improves on the tools it aggregates; (2) SMAD significantly outperforms other ensemble methods. © 2019 Elsevier Inc. All rights reserved.
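The general idea of learning how to combine several detectors, rather than simply counting their votes, can be illustrated with a minimal sketch. This is not the authors' implementation: the three detectors, their scores, and the labels below are invented toy data, and a plain logistic regression stands in for whatever model SMAD actually uses. The sketch only shows why a learned aggregator can beat majority voting when one detector is more reliable than the others.

```python
import math

# Hypothetical toy data: each row holds the normalized scores (0..1) that three
# imaginary detectors assign to one class; labels mark true God Class instances.
# Detector 0 is reliable, detectors 1 and 2 are noisy (an assumption of this toy).
SCORES = [
    [0.9, 0.2, 0.8], [0.8, 0.9, 0.3], [0.7, 0.1, 0.6], [0.9, 0.8, 0.9],
    [0.2, 0.9, 0.7], [0.1, 0.8, 0.6], [0.3, 0.1, 0.2], [0.2, 0.7, 0.1],
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def majority_vote(row, threshold=0.5):
    """Baseline ensemble: each detector votes, the majority wins."""
    votes = sum(1 for score in row if score >= threshold)
    return 1 if votes * 2 > len(row) else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=1.0, epochs=10000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(rows, labels):
            z = sum(w * x for w, x in zip(weights, row)) + bias
            error = sigmoid(z) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def learned_vote(row, weights, bias):
    """Learned ensemble: weighted combination of the detector scores."""
    z = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


weights, bias = train_logistic(SCORES, LABELS)
majority_acc = accuracy([majority_vote(r) for r in SCORES], LABELS)
learned_acc = accuracy([learned_vote(r, weights, bias) for r in SCORES], LABELS)
print("majority vote accuracy:", majority_acc)
print("learned aggregator accuracy:", learned_acc)
```

Both accuracies are measured on the training rows themselves, so the comparison says nothing about generalization; a real evaluation would use held-out projects, as the experiments described in the abstract do.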
Pages: 11