A penalized criterion for variable selection in classification

Cited by: 8
Authors
Mary-Huard, Tristan [1]
Robin, Stephane [1]
Daudin, Jean-Jacques [1]
Affiliation
[1] INRA, INAPG, Dept OMIP, Dept MIA, Paris 05, France
Keywords
statistical learning; variable selection; oracle inequality; penalized criterion
DOI
10.1016/j.jmva.2006.06.003
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
In this paper, the problem of variable selection in classification is considered. On the basis of recent developments in model selection theory, we provide a criterion based on penalized empirical risk, where the penalization explicitly takes into account the number of variables of the considered models. Moreover, we give an oracle-type inequality that non-asymptotically guarantees the performance of the resulting classification rule. We discuss the optimality of the proposed criterion and present an application of the main result to backward and forward selection procedures. (c) 2006 Elsevier Inc. All rights reserved.
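As a rough sketch of the kind of criterion the abstract describes, the generic form of a penalized empirical risk criterion can be written as below. The notation is assumed for illustration and is not taken from the paper: $\mathcal{M}$ is the collection of candidate variable subsets, $\hat f_m$ the classification rule fitted with subset $m$, $\mathcal{F}_m$ the corresponding class of classifiers, $R_n$ the empirical misclassification rate on $n$ observations, $L$ the true risk, and $\mathrm{pen}(|m|)$ a penalty depending on the number of variables $|m|$. The constant $C$ and the remainder $r_n$ are placeholders; the exact penalty and constants are not stated in the abstract.

```latex
% Generic form only (assumed notation); the paper's actual penalty and constants are not reproduced here.
\hat m \in \arg\min_{m \in \mathcal{M}} \left\{ R_n(\hat f_m) + \mathrm{pen}(|m|) \right\},
\qquad
R_n(f) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{ f(X_i) \neq Y_i \}

% Typical shape of an oracle-type inequality for the selected rule:
\mathbb{E}\left[ L(\hat f_{\hat m}) \right]
\le C \inf_{m \in \mathcal{M}} \left\{ \inf_{f \in \mathcal{F}_m} L(f) + \mathrm{pen}(|m|) \right\} + r_n
```

Read this way, a forward or backward selection procedure would compare nested subsets by the value of the penalized criterion rather than by the empirical risk alone.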
Pages: 695-705
Number of pages: 11