Semi-greedy heuristics for feature selection with test cost constraints

Cited by: 34
Authors
Min F. [1]
Xu J. [1]
Affiliation
[1] School of Computer Science, Southwest Petroleum University, Chengdu
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Granular computing; Semi-greedy; Test cost constraint
DOI
10.1007/s41066-016-0017-2
Abstract
In real-world applications, the test cost of data collection should not exceed a given budget. The problem of selecting an informative feature subset under this budget is referred to as feature selection with test cost constraints. Greedy heuristics are a natural and efficient method for this kind of combinatorial optimization problem. However, the recursive selection of locally optimal choices means that the global optimum is often missed. In this paper, we present a three-step semi-greedy heuristic method that directly forms a population of candidate solutions to obtain better results. In the first step, we design the heuristic function. The second step involves the random selection of a feature from the current best k features at each iteration. This is the major difference from conventional greedy heuristics. In the third step, we obtain p candidate solutions and select the best one. Through a series of experiments on four datasets, we compare our algorithm with a classic greedy heuristic approach and an information gain-based λ-weighted greedy heuristic method. The results show that the new approach is more likely to obtain optimal solutions. © 2016, Springer International Publishing Switzerland.
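The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the heuristic function, so the gain-per-cost ratio used here is an assumption, and the names `semi_greedy_select`, `costs`, `quality`, and `budget` are hypothetical. Test costs are assumed to be strictly positive.

```python
import random
from typing import Callable, Dict, FrozenSet, Optional


def semi_greedy_select(
    costs: Dict[str, float],
    quality: Callable[[FrozenSet[str]], float],
    budget: float,
    k: int = 3,
    p: int = 20,
    seed: Optional[int] = None,
) -> FrozenSet[str]:
    """Return the best of p semi-greedy candidate subsets within the budget."""
    rng = random.Random(seed)
    best: FrozenSet[str] = frozenset()
    best_quality = quality(best)

    for _ in range(p):  # step 3: build p candidate solutions
        selected: FrozenSet[str] = frozenset()
        spent = 0.0
        while True:
            base = quality(selected)
            scored = []
            for feature, cost in costs.items():
                if feature in selected or spent + cost > budget:
                    continue  # already chosen, or would exceed the budget
                gain = quality(selected | {feature}) - base
                if gain > 0:
                    # step 1 (assumed heuristic): quality gain per unit cost
                    scored.append((gain / cost, feature))
            if not scored:
                break
            scored.sort(reverse=True)
            # step 2: pick at random among the current best k features
            _, pick = rng.choice(scored[:k])
            selected = selected | {pick}
            spent += costs[pick]

        candidate_quality = quality(selected)
        if candidate_quality > best_quality:
            best, best_quality = selected, candidate_quality

    return best
```

With `k=1` the random choice always takes the top-ranked feature, so the sketch reduces to a conventional greedy heuristic. Larger `k` widens the set of subsets that can be reached, and larger `p` raises the chance that one of them beats the greedy result.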
Pages: 199-211
Page count: 12