SAGA: A Hybrid Technique to handle Imbalance Data in Software Defect Prediction

Cited by: 1
Authors
Malhotra, Ruchika [1]
Kapoor, Ritvik [1]
Saxena, Paridhi [1]
Sharma, Parth [1]
Affiliation
[1] Delhi Technol Univ, Dept Comp Sci & Engn, Delhi, India
Source
11TH IEEE SYMPOSIUM ON COMPUTER APPLICATIONS & INDUSTRIAL ELECTRONICS (ISCAIE 2021) | 2021
Keywords
software defect prediction; data imbalance; ensemble; feature space partitioning; Genetic Algorithm; Synthetic Minority Oversampling; FEATURE-SELECTION; SMOTE
DOI
10.1109/ISCAIE51753.2021.9431842
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Software defect prediction has been an ongoing topic in software quality research. Predictive models that identify defect-prone parts of software can be derived from defect data and software metrics. Past studies have explored machine learning-based approaches for this purpose, but the problem of handling imbalanced defect data without compromising model performance remains open. In this work, we propose, compare, and analyze a hybrid technique, SAGA (SMOTE + AdaSS + Genetic Algorithm), for addressing the imbalance problem in software defect prediction. SAGA employs ensemble classification based on feature space partitioning in conjunction with the Synthetic Minority Oversampling Technique (SMOTE). Various parameters related to feature space partitioning are optimized using the Genetic Algorithm. The values of ROC-AUC, G-mean, Balance, and Accuracy obtained on open-source datasets confirm the effectiveness of the proposed technique.
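The abstract reports G-mean and Balance alongside Accuracy because plain accuracy can look strong on imbalanced defect data even when no defective module is found. The record does not give the formulas the authors used, so the following is only a minimal sketch that assumes the definitions commonly seen in defect prediction work: G-mean as the geometric mean of the detection rate and the true negative rate, and Balance as one minus the normalized distance from the ideal point (false-alarm rate 0, detection rate 1). The function name `defect_metrics` and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Return accuracy, G-mean and Balance from confusion-matrix counts.

    Assumed definitions (not confirmed by the source record):
      pd      = tp / (tp + fn)   probability of detection (recall)
      pf      = fp / (fp + tn)   probability of false alarm
      g_mean  = sqrt(pd * (1 - pf))
      balance = 1 - sqrt((0 - pf)^2 + (1 - pd)^2) / sqrt(2)
    """
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    g_mean = math.sqrt(pd * (1.0 - pf))
    balance = 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)
    return {"accuracy": accuracy, "g_mean": g_mean, "balance": balance}


# Hypothetical data: 100 modules, 10 of them defective.
# A classifier that always predicts "not defective":
majority_only = defect_metrics(tp=0, fp=0, tn=90, fn=10)

# A classifier that finds most defects at the cost of some false alarms:
balanced = defect_metrics(tp=8, fp=18, tn=72, fn=2)

print(majority_only)
print(balanced)
```

With these made-up counts, the always-negative classifier has the higher accuracy but a G-mean of zero, while the second classifier scores lower on accuracy and higher on both G-mean and Balance. This is the kind of gap that motivates reporting imbalance-aware metrics.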
Pages: 331 - 336
Page count: 6
Related papers
50 in total
  • [41] Software defect prediction using a cost sensitive decision forest and voting, and a potential solution to the class imbalance problem
    Siers, Michael J.
    Islam, Md Zahidul
    INFORMATION SYSTEMS, 2015, 51 : 62 - 71
  • [42] A hybrid CRBA-SVM model for software defect prediction
    Li F.
    Rong X.
    Cui Z.
    International Journal of Wireless and Mobile Computing, 2016, 10 (02) : 191 - 196
  • [43] Software Defect Prediction via GIN with Hybrid Graphical Features
    Wang, Xuanye
    Lu, Lu
    Wang, Boye
    Shang, Yudong
    Yang, Hao
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY, AND SECURITY COMPANION, QRS-C, 2022, : 411 - 416
  • [44] A Hybrid Nonlinear Manifold Detection Approach for Software Defect Prediction
    Ghosh, Soumi
    Rana, Ajay
    Kansal, Vineet
    2018 7TH INTERNATIONAL CONFERENCE ON RELIABILITY, INFOCOM TECHNOLOGIES AND OPTIMIZATION (TRENDS AND FUTURE DIRECTIONS) (ICRITO), 2018, : 453 - 459
  • [45] Imbalanced Data Processing Model for Software Defect Prediction
    Zhou, Lijuan
    Li, Ran
    Zhang, Shudong
    Wang, Hua
    WIRELESS PERSONAL COMMUNICATIONS, 2018, 102 (02) : 937 - 950
  • [46] Research on Software Defect Prediction Based on Data Mining
    Chen, Yuan
    Shen, Xiang-heng
    Du, Peng
    Ge, Bing
    2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 1, 2010, : 563 - 567
  • [47] Class Imbalance Evolution and Verification Latency in Just-in-Time Software Defect Prediction
    Cabral, George G.
    Minku, Leandro L.
    Shihab, Emad
    Mujahid, Suhaib
    2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019), 2019, : 666 - 676
  • [48] Ensemble MultiBoost Based on RIPPER Classifier for Prediction of Imbalanced Software Defect Data
    He, Haitao
    Zhang, Xu
    Wang, Qian
    Ren, Jiadong
    Liu, Jiaxin
    Zhao, Xiaolin
    Cheng, Yongqiang
    IEEE ACCESS, 2019, 7 : 110333 - 110343
  • [49] Survey of software defect prediction features
    Shaoming Qiu
    Bicong E
    Jingjie He
    Liangyu Liu
    Neural Computing and Applications, 2025, 37 (4) : 2113 - 2144
  • [50] Methodologies to handle large volumes of design and defect data for improved pre-silicon defect prediction
    Wankhede, Parnashri
    Singh, Devansh
    Sahu, Manish Kumar
    Veluru, Aruna
    Babu, Monisa Ramesh
    Miao, Chenlong
    Song, Shenghua
    Malik, Shobhit
    Madhavan, Sriram
    DTCO AND COMPUTATIONAL PATTERNING, 2022, 12052