A penalized criterion for variable selection in classification

Cited by: 8
Authors
Mary-Huard, Tristan [1]
Robin, Stephane [1]
Daudin, Jean-Jacques [1]
Affiliations
[1] INRA, INAPG, Dept OMIP, Dept MIA, Paris 05, France
Keywords
statistical learning; variable selection; oracle inequality; penalized criterion
DOI
10.1016/j.jmva.2006.06.003
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
In this paper, the problem of variable selection in classification is considered. On the basis of recent developments in model selection theory, we provide a criterion based on penalized empirical risk, where the penalization explicitly takes into account the number of variables of the considered models. Moreover, we give an oracle-type inequality that non-asymptotically guarantees the performance of the resulting classification rule. We discuss the optimality of the proposed criterion and present an application of the main result to backward and forward selection procedures. (c) 2006 Elsevier Inc. All rights reserved.
Pages: 695-705
Page count: 11