Cross-validation as the objective function for variable-selection techniques

Cited by: 212
Authors
Baumann, K [1]
Affiliation
[1] Univ Wurzburg, Dept Pharm, D-97074 Wurzburg, Germany
Keywords
chance correlation; cross-validation; overfitting; permutation test; variable selection
DOI
10.1016/S0165-9936(03)00607-1
Chinese Library Classification number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Different methods of cross-validation are studied for their suitability to guide variable-selection algorithms toward highly predictive models. It is shown that the commonly applied leave-one-out cross-validation has a strong tendency to overfit, underestimates the true prediction error, and should not be used without further constraints or further validation. Alternatives to leave-one-out cross-validation and other validation methods are presented. © 2003 Published by Elsevier Science B.V.
Pages: 395-406
Number of pages: 12
Related papers
51 in total
[1]   Feature selection for structure-activity correlation using binary particle swarms [J].
Agrafiotis, DK ;
Cedeño, W .
JOURNAL OF MEDICINAL CHEMISTRY, 2002, 45 (05) :1098-1107
[2]   RELATIONSHIP BETWEEN VARIABLE SELECTION AND DATA AUGMENTATION AND A METHOD FOR PREDICTION [J].
ALLEN, DM .
TECHNOMETRICS, 1974, 16 (01) :125-127
[3]   Use of spectral window preprocessing for selecting near-infrared reflectance wavelengths for determination of the degree of enzymatic retting of intact flax stems [J].
Archibald, DD ;
Akin, DE .
VIBRATIONAL SPECTROSCOPY, 2000, 23 (02) :169-180
[4]   PREDICTIVE ABILITY OF REGRESSION-MODELS .2. SELECTION OF THE BEST PREDICTIVE PLS MODEL [J].
BARONI, M ;
CLEMENTI, S ;
CRUCIANI, G ;
COSTANTINO, G ;
RIGANELLI, D ;
OBERRAUCH, E .
JOURNAL OF CHEMOMETRICS, 1992, 6 (06) :347-356
[5]   GENERATING OPTIMAL LINEAR PLS ESTIMATIONS (GOLPE) - AN ADVANCED CHEMOMETRIC TOOL FOR HANDLING 3D-QSAR PROBLEMS [J].
BARONI, M ;
COSTANTINO, G ;
CRUCIANI, G ;
RIGANELLI, D ;
VALIGI, R ;
CLEMENTI, S .
QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS, 1993, 12 (01) :9-20
[6]   A systematic evaluation of the benefits and hazards of variable selection in latent variable regression. Part II. Practical applications [J].
Baumann, K ;
von Korff, M ;
Albert, H .
JOURNAL OF CHEMOMETRICS, 2002, 16 (07) :351-360
[7]   Baumann K, 2002, QUANT STRUCT-ACT REL, V21, P507, DOI 10.1002/1521-3838(200211)21:5<507::AID-QSAR507>3.0.CO;2-L
[9]   A systematic evaluation of the benefits and hazards of variable selection in latent variable regression. Part I. Search algorithm, theory and simulations [J].
Baumann, K ;
Albert, H ;
von Korff, M .
JOURNAL OF CHEMOMETRICS, 2002, 16 (07) :339-350
[10]   An alignment-independent versatile structure descriptor for QSAR and QSPR based on the distribution of molecular features [J].
Baumann, K .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (01) :26-35