Challenges of Feature Selection for Big Data Analytics

Cited by: 144
Authors
Li J. [1]
Liu H. [1]
Affiliations
[1] Li, Jundong
[2] Liu, Huan
Source
1600 / Institute of Electrical and Electronics Engineers Inc., United States / No. 32
Funding
U.S. National Science Foundation;
Keywords
big data; feature selection; intelligent systems; repository;
DOI
10.1109/MIS.2017.38
Abstract
We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms. © 2017 IEEE.
Pages: 9-15
Page count: 6