An efficient method for feature selection in linear regression based on an extended Akaike's information criterion

Cited by: 1
Authors
Vetrov, D. P. [1 ]
Kropotov, D. A. [2 ]
Ptashko, N. O. [1 ]
Affiliations
[1] Moscow MV Lomonosov State Univ, Fac Computat Math & Cybernet, Moscow 119992, Russia
[2] Russian Acad Sci, Dorodnicyn Comp Ctr, Moscow 119333, Russia
Funding
Russian Foundation for Basic Research
Keywords
pattern recognition; linear regression; feature selection; Akaike's information criterion;
DOI
10.1134/S096554250911013X
CLC number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
A method for feature selection in linear regression based on an extension of Akaike's information criterion is proposed. Using the classical Akaike information criterion (AIC) for feature selection requires an exhaustive search through all subsets of features, which is computationally prohibitive. A new information criterion is proposed that is a continuous extension of AIC; as a result, the feature selection problem reduces to a smooth optimization problem, and an efficient procedure for solving it is derived. Experiments show that the proposed method efficiently selects features in linear regression. In the experiments, the proposed procedure is compared with the relevance vector machine, a feature selection method based on the Bayesian approach, and both procedures are shown to yield similar results. The main distinction of the proposed method is that certain regularization coefficients are identically zero, which makes it possible to avoid the underfitting effect characteristic of the relevance vector machine. A special case (the so-called nondiagonal regularization) is considered in which the two methods coincide.
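To make the motivation concrete, the sketch below illustrates only the classical baseline the abstract refers to: exhaustive AIC-based subset selection for Gaussian linear regression, where a subset with k features and residual sum of squares RSS on n samples scores n*ln(RSS/n) + 2k (additive constants dropped). This is not the paper's continuous extension; the function names and the use of NumPy are illustrative assumptions.

    # Minimal sketch, assuming a Gaussian linear regression model:
    # classical exhaustive-search AIC feature selection (the baseline the
    # abstract describes), NOT the paper's continuous extension of AIC.
    from itertools import combinations
    import numpy as np

    def gaussian_aic(X_sub, y):
        """AIC of an ordinary least-squares fit: n*ln(RSS/n) + 2k, constants dropped."""
        n, k = X_sub.shape
        beta, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
        rss = float(np.sum((y - X_sub @ beta) ** 2))
        return n * np.log(rss / n) + 2 * k

    def exhaustive_aic_selection(X, y):
        """Return the feature subset minimizing AIC; cost grows exponentially in the number of features."""
        d = X.shape[1]
        best_subset, best_aic = None, np.inf
        for r in range(1, d + 1):
            for subset in combinations(range(d), r):
                aic = gaussian_aic(X[:, list(subset)], y)
                if aic < best_aic:
                    best_subset, best_aic = subset, aic
        return best_subset, best_aic

The exponential number of subsets in this baseline is exactly what the proposed criterion avoids: by relaxing the discrete subset choice to continuous regularization coefficients, selection becomes a smooth optimization problem.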
Pages: 1972-1985
Number of pages: 14