Feature Selection using Mutual Information for High-dimensional Data Sets

Cited by: 0
Authors
Nagpal, Arpita [1]
Gaur, Deepti [1]
Gaur, Seema [2]
Affiliations
[1] ITM Univ, Dept Comp Sci, Gurgaon, India
[2] Banasthali Univ, Banasthali, Rajasthan, India
Source
SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC) | 2014
Keywords
Correlation; feature selection; minimum spanning tree; data set
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To reduce the dimensionality of a dataset, redundant and irrelevant features need to be separated out of the multidimensional data, which requires a feature selection technique. Here, a feature selection technique that removes irrelevant features is used. Correlation measures based on the concept of mutual information are adopted to calculate the degree of association between features. The authors propose a new algorithm that segregates features from high-dimensional data by visualizing the relevant features of a dataset in the form of a graph.
Pages: 45-49
Number of pages: 5
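
The abstract mentions correlation measures based on mutual information, but this record does not give the exact measure the authors use. As a rough illustration only, the following Python sketch computes the standard plug-in estimate of mutual information between two discrete features. The function name `mutual_information` and the toy inputs are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences.

    Illustrative sketch only: this is the textbook estimator, not
    necessarily the correlation measure used by the paper's authors.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("x and y must not be empty")

    count_x = Counter(x)            # marginal counts of X
    count_y = Counter(y)            # marginal counts of Y
    count_xy = Counter(zip(x, y))   # joint counts of (X, Y)

    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi
```

With this estimator, two identical balanced binary features share 1 bit of information, while two independent ones share 0 bits. A feature whose score against the class label is near zero would be a candidate for removal as irrelevant.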