Implementation of FAST Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data

Cited by: 0
Authors
Shilu, Smit [1 ]
Sheth, Kushal [1 ]
Mehul, Ekata [2 ]
Affiliations
[1] Charusat Univ, EiTRA EInfochips Training, Changa, India
[2] Res Acad, Ahmadabad, Gujarat, India
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT ICT4SD 2015, VOL 2 | 2016 / Vol. 409
Keywords
Feature subset selection; Feature clustering; Filter method; Kruskal's algorithm; Graph-based clustering;
DOI
10.1007/978-981-10-0135-2_19
CLC classification
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In feature selection, we are concerned with finding those features that produce results similar to those of the original, entire set of features. Feature selection algorithms are evaluated on efficiency and effectiveness: efficiency concerns the time required to find a subset of features, and effectiveness concerns the quality of that subset. On these criteria, a fast clustering-based feature selection algorithm (FAST) has been proposed, implemented, and experimentally evaluated in this paper. Dimensionality reduction of the data is the most important feature of FAST. First, a graph-theoretic clustering method divides the features into clusters. Next, a subset is formed by selecting from each cluster the feature that is most representative and most strongly related to the target classes. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of features that are both useful and independent. The efficiency of FAST is ensured by using a minimum spanning tree (MST) together with Kruskal's algorithm.
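The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: absolute Pearson correlation stands in for the symmetric-uncertainty measure the original FAST paper uses, and the `cut` threshold that splits the MST into clusters is an assumed, illustrative parameter.

```python
import numpy as np

class UnionFind:
    """Disjoint-set structure used by Kruskal's algorithm."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True

def fast_feature_subset(X, y, cut=0.5):
    """Sketch of the FAST idea: build an MST over the features with
    Kruskal's algorithm, cut weak edges to form feature clusters, and
    keep from each cluster the feature most relevant to the target."""
    n = X.shape[1]
    # Feature-feature relevance; |Pearson correlation| is a stand-in
    # for the symmetric uncertainty used in the original paper.
    with np.errstate(invalid="ignore"):
        ff = np.abs(np.corrcoef(X, rowvar=False))
    ff = np.nan_to_num(ff)
    # Kruskal: sort edges of the complete feature graph by weight
    # (weight = 1 - relevance, so strongly related features are close).
    edges = sorted((1.0 - ff[i, j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    uf, mst = UnionFind(n), []
    for w, i, j in edges:
        if uf.union(i, j):
            mst.append((w, i, j))
            if len(mst) == n - 1:
                break
    # Remove MST edges whose relevance is below `cut`; the surviving
    # forest components are the feature clusters.
    clusters_uf = UnionFind(n)
    for w, i, j in mst:
        if 1.0 - w >= cut:
            clusters_uf.union(i, j)
    clusters = {}
    for f in range(n):
        clusters.setdefault(clusters_uf.find(f), []).append(f)
    # Per cluster, keep the single feature most relevant to the target.
    fy = np.nan_to_num(np.abs(
        [np.corrcoef(X[:, f], y)[0, 1] for f in range(n)]))
    return sorted(max(c, key=lambda f: fy[f]) for c in clusters.values())
```

Redundant features end up in the same MST component and collapse to one representative, while features in different clusters are only weakly related to each other, which is what gives the selected subset its useful-and-independent character.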
Pages: 203 / 213
Page count: 11
Related papers (4 items)
[1] Guyon I, 2003, J MACH LEARN RES, V3, P1157, DOI 10.1162/153244303322753616
[2] Karthikeyan P., 2014, INT J ENG RES APPL, V4, P65
[3] Song Q., 2013, FAST CLUSTERING BASE
[4] Yu L, 2004, J MACH LEARN RES, V5, P1205