Recent advances and emerging challenges of feature selection in the context of big data

Cited by: 176
Authors
Bolon-Canedo, V. [1]
Sanchez-Marono, N. [1]
Alonso-Betanzos, A. [1]
Affiliations
[1] Univ A Coruna, Dept Comp Sci, La Coruna 15071, Spain
Keywords
Feature selection; Big data; High dimensionality; FEATURE SUBSET-SELECTION; DIMENSIONAL FEATURE-SELECTION; MICROARRAY DATA; MULTICLASS CLASSIFICATION; IMAGE ANNOTATION; FACE RECOGNITION; GENE SELECTION; INFORMATION; REGRESSION; EFFICIENT;
DOI
10.1016/j.knosys.2015.05.014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In an era of growing data complexity and volume and the advent of big data, feature selection has a key role to play in helping reduce high dimensionality in machine learning problems. We discuss the origins and importance of feature selection and outline recent contributions in a range of applications, from DNA microarray analysis to face recognition. Recent years have witnessed the creation of vast datasets, and it seems clear that these will only continue to grow in size and number. This new big data scenario offers both opportunities and challenges to feature selection researchers, as there is a growing need for scalable yet efficient feature selection methods, given that existing methods are likely to prove inadequate. © 2015 Elsevier B.V. All rights reserved.
Pages: 33-45
Number of pages: 13