Boosting partial least squares

Cited by: 70
Authors
Zhang, MH
Xu, QS
Massart, DL
Affiliations
[1] Free Univ Brussels, Inst Pharmaceut, Dept Pharmaceut & Biomed Anal, ChemoAC, B-1090 Brussels, Belgium
[2] Cent S Univ, Sch Math Sci & Comp Technol, Changsha 410083, Peoples R China
DOI
10.1021/ac048561m
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
A difficulty when applying partial least squares (PLS) in multivariate calibration is that overfitting may occur. This study proposes a novel approach by combining PLS and boosting. The latter is said to be resistant to overfitting. The proposed method, called boosting PLS (BPLS), combines a set of shrunken PLS models, each with only one PLS component. The method is iterative: the models are constructed on the basis of the residuals of the responses that are not explained by previous models. Unlike classical PLS, BPLS does not need to select an adequate number of PLS components to be included in the model. On the other hand, two parameters must be determined: the shrinkage value and the iteration number. Criteria are proposed for these two purposes. BPLS was applied to seven real data sets, and the results demonstrate that it is more resistant than classical PLS to overfitting without losing accuracy.
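The iteration described in the abstract (one-component PLS models fitted to the residuals left by earlier models, each added with a shrinkage factor) can be sketched roughly as follows. This is a minimal illustration built only from the abstract, not the authors' implementation: the function names `one_component_pls` and `boosting_pls` are hypothetical, a single response variable is assumed, and the shrinkage value and iteration number are fixed by hand because the selection criteria proposed in the paper are not given here.

```python
import numpy as np


def one_component_pls(X, r):
    """Coefficients of a one-component PLS1 fit of r on X (both assumed centered)."""
    w = X.T @ r                      # weight vector: covariance direction between X and r
    norm = np.linalg.norm(w)
    if norm == 0.0:                  # nothing left to explain
        return np.zeros(X.shape[1])
    w = w / norm
    t = X @ w                        # score vector
    q = (t @ r) / (t @ t)            # regression of r on the score
    return w * q                     # so that X @ (w * q) == t * q


def boosting_pls(X, y, shrinkage=0.5, n_iter=50):
    """Hypothetical sketch: sum of shrunken one-component PLS models fitted to residuals.

    shrinkage and n_iter are illustrative defaults, not values from the paper.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    residual = y - y_mean            # part of the response not yet explained
    coef = np.zeros(X.shape[1])
    for _ in range(n_iter):
        b = one_component_pls(Xc, residual)
        coef += shrinkage * b        # add the shrunken weak model
        residual = residual - shrinkage * (Xc @ b)
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

Predictions for new samples would then be `X_new @ coef + intercept`. In this sketch no number of PLS components is chosen; the only tuning parameters are `shrinkage` and `n_iter`, which correspond to the two parameters the abstract says must be determined.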
Pages: 1423-1431
Page count: 9
Related papers
35 in total
[1]  
[Anonymous], 1988, Journal of Chemometrics
[2]  
[Anonymous], 1966, Multivariate Analysis
[3]  
BETZIN J, 2003, 3 INT S PLS REL METH, P261
[4]   Improving nonparametric regression methods by bagging and boosting [J].
Borra, S ;
Di Ciaccio, A .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2002, 38 (04) :407-420
[5]   Boosting with the L2 loss: Regression and classification [J].
Bühlmann, P ;
Yu, B .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2003, 98 (462) :324-339
[6]  
COPAS JB, 1983, J R STAT SOC B, V45, P311
[7]  
CORBISIER S, VALORISATION BASES D
[8]   Classification and regression trees-studies of HIV reverse transcriptase inhibitors [J].
Daszykowski, M ;
Walczak, B ;
Xu, QS ;
Daeyaert, F ;
de Jonge, MR ;
Heeres, J ;
Koymans, LMH ;
Lewi, PJ ;
Vinkers, HM ;
Janssen, PA ;
Massart, DL .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (02) :716-726
[9]  
Drucker H., 1997, ICML 97, P107
[10]   Boosting methods for regression [J].
Duffy, N ;
Helmbold, D .
MACHINE LEARNING, 2002, 47 (2-3) :153-200