Comparing and experimenting machine learning techniques for code smell detection

Cited by: 257
Authors
Fontana, Francesca Arcelli [1]
Mantyla, Mika V. [4, 5]
Zanoni, Marco [2]
Marino, Alessandro [3]
Affiliations
[1] Univ Milano Bicocca, Dept Comp Sci, Milan, Italy
[2] Univ Milano Bicocca, Dept Informat Syst & Commun, Milan, Italy
[3] Univ Milano Bicocca, Milan, Italy
[4] Univ Oulu, Software Engn, Oulu, Finland
[5] Aalto Univ, Helsinki, Finland
Keywords
Code smells detection; Machine learning techniques; Benchmark for code smell detection; BAD SMELLS; QUALITY; CLASSIFICATION
DOI
10.1007/s10664-015-9378-4
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Several code smell detection tools have been developed, and they provide different results because smells can be subjectively interpreted, and hence detected, in different ways. In this paper, we perform what is, to the best of our knowledge, the largest experiment applying machine learning algorithms to code smell detection. We experiment with 16 different machine learning algorithms on four code smells (Data Class, Large Class, Feature Envy, Long Method) and 74 software systems, with 1986 manually validated code smell samples. We found that all algorithms achieved high performance on the cross-validation data set; the highest performance was obtained by J48 and Random Forest, while the worst performance was achieved by support vector machines. However, the lower prevalence of code smells, i.e., imbalanced data, in the entire data set caused varying performance that needs to be addressed in future studies. We conclude that the application of machine learning to the detection of these code smells can provide high accuracy (>96%), and only a hundred training examples are needed to reach at least 95% accuracy.
Pages: 1143-1191
Page count: 49