Mining Version Histories for Detecting Code Smells

Cited by: 194
Authors
Palomba, Fabio [1]
Bavota, Gabriele [2]
Di Penta, Massimiliano [3]
Oliveto, Rocco [4]
Poshyvanyk, Denys [5]
De Lucia, Andrea [1]
Affiliations
[1] Univ Salerno, Fisciano, SA, Italy
[2] Free Univ Bozen, Bolzano, Italy
[3] Univ Sannio, Benevento, Italy
[4] Univ Molise, Pesche, IS, Italy
[5] Coll William & Mary, Williamsburg, VA USA
Funding
US National Science Foundation
Keywords
Code smells; mining software repositories; empirical studies; BAD SMELLS; IMPACT; ANTIPATTERNS; PROBABILITY; SYSTEM;
DOI
10.1109/TSE.2014.2372760
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Code smells are symptoms of poor design and implementation choices that may hinder code comprehension, and possibly increase change- and fault-proneness. While most detection techniques rely only on structural information, many code smells are intrinsically characterized by how code elements change over time. In this paper, we propose Historical Information for Smell deTection (HIST), an approach exploiting change history information to detect instances of five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy. We evaluate HIST in two empirical studies. The first, conducted on 20 open source projects, aimed at assessing the accuracy of HIST in detecting instances of the code smells mentioned above. The results indicate that the precision of HIST ranges between 72 and 86 percent, and its recall ranges between 58 and 100 percent. The results of the first study also indicate that HIST is able to identify code smells that cannot be identified by competitive approaches based solely on code analysis of a single system's snapshot. We then conducted a second study aimed at investigating to what extent the code smells detected by HIST (and by competitive code analysis techniques) reflect developers' perception of poor design and implementation choices. We involved 12 developers of four open source projects, who recognized more than 75 percent of the code smell instances identified by HIST as actual design/implementation problems.
Pages: 462-489
Page count: 28
Related papers
55 in total
  • [1] An Empirical Study of the Impact of Two Antipatterns, Blob and Spaghetti Code, On Program Comprehension
    Abbes, Marwen
    Khomh, Foutse
    Gueheneuc, Yann-Gael
    Antoniol, Giuliano
    [J]. 2011 15TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR), 2011, : 181 - 190
  • [2] Adams B., 2010, P 32 ACMIEEE INT C S, P305
  • [3] Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
  • [4] [Anonymous], 1999, Modern Information Retrieval
  • [5] Arcoverde R., 2011, P INT WORKSH REF TOO, P33, DOI 10.1145/1984732.1984740
  • [6] Response rate in academic studies - A comparative analysis
    Baruch, Y
    [J]. HUMAN RELATIONS, 1999, 52 (04) : 421 - 438
  • [7] Automating extract class refactoring: an improved method and its evaluation
    Bavota, Gabriele
    De Lucia, Andrea
    Marcus, Andrian
    Oliveto, Rocco
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2014, 19 (06) : 1617 - 1664
  • [8] Boussaa Mohamed, 2013, Search Based Software Engineering. 5th International Symposium, SSBSE 2013. Proceedings: LNCS 8084, P50, DOI 10.1007/978-3-642-39742-4_6
  • [9] Brown W. H., 1998, AntiPatterns: refactoring software, architectures, and projects in crisis
  • [10] Canfora G, 2006, PROC IEEE INT CONF S, P213