A survey on online feature selection with streaming features

Cited by: 50
Authors
Hu, Xuegang [1]
Zhou, Peng [1]
Li, Peipei [1]
Wang, Jing [1]
Wu, Xindong [2]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Anhui, Peoples R China
[2] Univ Louisiana Lafayette, Lafayette, LA 70504 USA
Funding
National Natural Science Foundation of China
Keywords
big data; feature selection; online feature selection; feature stream; mutual information; classification; regression; relevance; tracking
DOI
10.1007/s11704-016-5489-3
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In the era of big data, the dimensionality of data is increasing dramatically in many domains. To deal with high dimensionality, online feature selection has become critical in big data mining. Recently, online selection of dynamic features has received much attention. When features arrive sequentially over time, feature selection must be performed online, as each feature arrives. When features are grouped, it is also necessary to handle features that arrive in groups. To address these challenges, several state-of-the-art methods for online feature selection have been proposed. In this paper, we first give a brief review of traditional feature selection approaches. We then discuss in detail the specific problems of online feature selection with feature streams. We present a comprehensive review of existing online feature selection methods and compare them with one another. Finally, we discuss several open issues in online feature selection.
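To make the streaming setting concrete, the following is a minimal illustrative sketch in Python, not a method taken from the survey. It assumes the number of samples is fixed, that features arrive one at a time as (name, values) pairs, and that a feature is kept when it is sufficiently correlated with the labels and not nearly collinear with a feature already kept. The Pearson criterion, both thresholds, and the function names `pearson` and `select_streaming_features` are hypothetical choices made for illustration only.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_streaming_features(
    stream: Iterable[Tuple[str, Sequence[float]]],
    labels: Sequence[float],
    relevance_threshold: float = 0.3,   # assumed value, for illustration
    redundancy_threshold: float = 0.9,  # assumed value, for illustration
) -> List[str]:
    """Process features one at a time, deciding on each as it arrives."""
    selected: Dict[str, Sequence[float]] = {}
    for name, values in stream:
        # Relevance check: discard features weakly correlated with the labels.
        if abs(pearson(values, labels)) < relevance_threshold:
            continue
        # Redundancy check: discard features nearly collinear with a kept one.
        if any(
            abs(pearson(values, kept)) >= redundancy_threshold
            for kept in selected.values()
        ):
            continue
        selected[name] = values
    return list(selected)


if __name__ == "__main__":
    labels = [0.0, 0.0, 1.0, 1.0]
    stream = [
        ("f1", [0.1, 0.2, 0.9, 0.8]),  # relevant
        ("f2", [0.2, 0.4, 1.8, 1.6]),  # scaled copy of f1, hence redundant
        ("f3", [0.0, 1.0, 0.0, 1.0]),  # uncorrelated with the labels
        ("f4", [0.0, 1.0, 1.0, 1.0]),  # relevant, not redundant with f1
    ]
    print(select_streaming_features(stream, labels))
```

The sketch only ever compares a new feature against the labels and the currently selected set, so it never needs the full feature space in advance. Methods of the kind the abstract refers to typically use more principled relevance and redundancy tests and may also revisit earlier selections.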
Pages: 479-493
Number of pages: 15