Hybrid Global Sensitivity Analysis Based Optimal Attribute Selection Using Classification Techniques by Machine Learning Algorithm

Cited by: 0
Authors
G. Saranya
A. Pravin
Affiliations
[1] Sathyabama Institute of Science and Technology,Department of Computer Science and Engineering, School of Computing
[2] SRM Institute of Science and Technology,Department of Networking and Communications, School of Computing
Keywords
Classification; Genetic algorithm; Particle swarm; Global sensitivity analysis; Random forest; Filter selection;
DOI
Not available
Abstract
Feature selection is a major step in data mining and classification. It improves classifier performance and reduces computation time by removing redundant and irrelevant information from the dataset. When all variables, which may number from 10 to 100, are passed to the classifier without selection, classification takes more time and the classifier's efficiency is low. The best features can be selected by one of three methods: wrapper, filter, and embedded. In the wrapper method, feature selection follows one of two approaches: sequential search or a heuristic approach. In sequential search, the feature subset is built up starting from the empty set. In the heuristic approach, the feature subset is chosen so as to achieve an objective function. Heuristic approaches include optimization algorithms such as the genetic algorithm and particle swarm optimization. Only a few works have suggested the random forest classifier, owing to the hierarchical arrangement of the data. Among the three selection techniques, the wrapper-based technique is able to produce high accuracy. This is due to its attribute selection method. This problem is addressed by the proposed global sensitivity analysis approach, in which an optimized filter technique is proposed for feature selection in classification. Attributes for classification are selected in two stages. In the first stage, the filter selection approach is based on global sensitivity analysis. In the second stage, the dominant attribute from the first stage is determined through a wrapper approach using particle swarm optimization. Because of this multistage feature selection, the proposed approach can be applied to any type of machine learning application. The proposed particle swarm optimization based global sensitivity analysis (PSO-GSA) is applied to the Cleveland dataset using MATLAB.
Its performance is evaluated in terms of accuracy, sensitivity, and specificity, and it is compared with the wrapper selection method. The proposed PSO-GSA outperforms wrapper selection, with an accuracy of 90% and a sensitivity of 94.74%. The computational time for the proposed GSA-based classification of heart disease using a random forest classifier is 0.7689 s, which is less than the computational time of classifiers with bagging and boosting techniques.
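The two-stage idea described in the abstract (a sensitivity-based filter followed by a particle swarm wrapper) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation, and several pieces are assumptions made only for illustration: the filter score is a binned estimate of a first-order sensitivity index, Var(E[y | x]) / Var(y), since the abstract does not say which global sensitivity index is used; a nearest-centroid classifier stands in for the random forest; the PSO coefficients are arbitrary; and the data are synthetic rather than the Cleveland dataset. All function names are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def first_order_index(x, y, bins=4):
    """Binned estimate of Var(E[y | x]) / Var(y) (assumed stand-in for the GSA score)."""
    n, total_var = len(y), pvariance(y)
    if total_var == 0:
        return 0.0
    order = sorted(range(n), key=lambda i: x[i])
    overall, between = mean(y), 0.0
    for b in range(bins):
        idx = order[b * n // bins:(b + 1) * n // bins]  # equal-count bin of x
        if idx:
            between += len(idx) * (mean(y[i] for i in idx) - overall) ** 2
    return between / (n * total_var)


def centroid_accuracy(X, y, cols, train, test):
    """Wrapper fitness: hold-out accuracy of a nearest-centroid classifier (stand-in)."""
    if not cols:
        return 0.0
    cent = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        cent[label] = [mean(r[c] for r in rows) for c in cols]
    hits = 0
    for i in test:
        pred = min(cent, key=lambda k: sum((X[i][c] - m) ** 2
                                           for c, m in zip(cols, cent[k])))
        hits += pred == y[i]
    return hits / len(test)


def binary_pso(fitness, dim, particles=8, iters=15, seed=0):
    """Binary PSO: each velocity passes through a sigmoid to give a bit probability."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(p) for p in pos]
    g = max(range(particles), key=lambda k: pfit[k])
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for k in range(particles):
            for d in range(dim):
                v = (0.7 * vel[k][d]
                     + 1.5 * rng.random() * (pbest[k][d] - pos[k][d])
                     + 1.5 * rng.random() * (gbest[d] - pos[k][d]))
                vel[k][d] = max(-4.0, min(4.0, v))  # clamp to avoid saturation
                prob = 1.0 / (1.0 + math.exp(-vel[k][d]))
                pos[k][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[k])
            if f > pfit[k]:
                pbest[k], pfit[k] = pos[k][:], f
            if f > gfit:
                gbest, gfit = pos[k][:], f
    return gbest, gfit


def two_stage_select(X, y, keep=3, seed=0):
    """Stage 1 filters by sensitivity score; stage 2 runs the PSO wrapper on survivors."""
    n, n_feat = len(X), len(X[0])
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    train, test = idx[:2 * n // 3], idx[2 * n // 3:]
    y_train = [y[i] for i in train]
    # Stage 1 (filter): score every feature on the training rows, keep the top ones.
    scores = [first_order_index([X[i][j] for i in train], y_train)
              for j in range(n_feat)]
    kept = sorted(range(n_feat), key=lambda j: -scores[j])[:keep]

    # Stage 2 (wrapper): PSO searches bit masks over the kept features only.
    def fitness(mask):
        cols = [kept[d] for d, bit in enumerate(mask) if bit]
        return centroid_accuracy(X, y, cols, train, test)

    mask, acc = binary_pso(fitness, len(kept), seed=seed)
    return scores, kept, [kept[d] for d, bit in enumerate(mask) if bit], acc


def make_toy_data(n=120, n_feat=6, seed=1):
    """Synthetic data (hypothetical): only feature 0 carries the class signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[y[i] + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(n_feat - 1)]
         for i in range(n)]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    scores, kept, selected, acc = two_stage_select(X, y)
    print("filter kept:", kept, "wrapper selected:", selected, "accuracy:", acc)
```

In this sketch the filter stage shrinks the search space before the wrapper runs, so the swarm evaluates far fewer candidate subsets than it would over all features. That is one plausible reading of why a multistage scheme can cut computation time, though the abstract does not state the mechanism explicitly.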
Pages: 2305-2324
Page count: 19