A Distance-Based Feature Selection Approach for Software Anomaly Detection

Cited by: 0
Authors
Akhter, Suravi [1]
Sajeeda, Afia [2]
Kabir, Ahmedul [2]
Affiliations
[1] Univ Liberal Arts Bangladesh, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] Univ Dhaka, Inst Informat Technol, Dhaka, Bangladesh
Source
PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING, ENASE 2023 | 2023
Keywords
Software Defect Prediction; Bug Severity Classification; Feature Selection; Mutual Information; Defect Prediction; Algorithms; Relevance
DOI
10.5220/0011859500003464
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
A software anomaly is a bug, defect, or anything else that causes the software to deviate from its normal behavior. Anomalies must be identified properly to build more stable and error-free software systems. Various machine learning-based approaches exist for anomaly detection. For proper anomaly detection, feature selection is a necessary step: it removes noisy and irrelevant features and thus reduces the dimensionality of the given feature vector. Most existing feature selection methods rank the given features using selection criteria such as mutual information (MI) and distance. However, these methods, especially the MI-based ones, fail to capture feature interaction during ranking and selection when the feature dimension is large, which degrades the discrimination ability of the selected feature set. Moreover, it is difficult to decide how many features to take from the ranked feature set to obtain acceptable performance. To solve these problems, this paper proposes anomaly detection for software data (ADSD), a feature subset selection method that is able to capture interactive and relevant feature subsets. Experimental results on 15 benchmark software defect datasets and two bug severity classification datasets demonstrate the performance of ADSD in comparison to four state-of-the-art methods.
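The abstract's point that per-feature MI ranking can miss feature interaction can be illustrated with a small toy example. The sketch below is not the ADSD method; the data, the feature names `f1`, `f2`, `f3`, and the helper `mutual_information` are hypothetical and chosen only for illustration. The label is the XOR of `f1` and `f2`, so each of those features alone carries no information about the label, while the pair together determines it.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical toy data: the label is f1 XOR f2 (a pure interaction).
f1 = [0, 0, 1, 1, 0, 0, 1, 1]
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

# f3 is a weakly relevant feature: it agrees with the label on 6 of 8 samples.
f3 = [0, 1, 1, 0, 0, 1, 0, 1]

# Univariate ranking: score each feature on its own against the label.
scores = {
    "f1": mutual_information(f1, label),
    "f2": mutual_information(f2, label),
    "f3": mutual_information(f3, label),
}
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Joint score: treat the pair (f1, f2) as a single discrete variable.
joint_score = mutual_information(list(zip(f1, f2)), label)

for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
print(f"(f1, f2) jointly: {joint_score:.3f} bits")
```

On this toy data the univariate ranking places `f3` first and scores `f1` and `f2` at zero, even though `f1` and `f2` together predict the label perfectly. A subset-level criterion, which is what the abstract says ADSD aims for, would be needed to recover the pair.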
Pages: 149-157
Page count: 9