AdaBoost Semiparametric Model Averaging Prediction for Multiple Categories

Cited by: 27
Authors
Li, Jialiang [1]
Lv, Jing [2]
Wan, Alan T. K. [3]
Liao, Jun [4]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
[2] Southwest Univ, Sch Math & Stat, Chongqing 400715, Peoples R China
[3] City Univ Hong Kong, Dept Management Sci, Kowloon Tong, Hong Kong, Peoples R China
[4] Renmin Univ China, Sch Stat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Boosting; Model averaging; Model misspecification; Prediction accuracy; Smoothing; Varying coefficient structure identification; GENERALIZED LINEAR-MODELS; NONCONCAVE PENALIZED LIKELIHOOD; VARYING COEFFICIENT MODELS; VARIABLE SELECTION; STATISTICAL VIEW; DIMENSION REDUCTION; EVIDENCE CONTRARY; LOGISTIC-REGRESSION; JMLR; 9; CLASSIFICATION;
DOI
10.1080/01621459.2020.1790375
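Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code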
020208 ; 070103 ; 0714 ;
Abstract
Model average techniques are very useful for model-based prediction. However, most earlier works in this field focused on parametric models and continuous responses. In this article, we study varying coefficient multinomial logistic models and propose a semiparametric model averaging prediction (SMAP) approach for multi-category outcomes. The proposed procedure does not need any artificial specification of the index variable in the adopted varying coefficient sub-model structure to forecast the response. In particular, this new SMAP method is more flexible and robust against model misspecification. To improve the practical predictive performance, we combine SMAP with the AdaBoost algorithm to obtain more accurate estimates of class probabilities and model averaging weights. We compare our proposed methods with all existing model averaging approaches and a wide range of popular classification methods via extensive simulations. An automobile classification study is included to illustrate the merits of our methodology. Supplementary materials for this article are available online.
Pages: 495-509
Number of pages: 15
Related papers
87 in total
[41]   Least squares model averaging [J].
Hansen, Bruce E. .
ECONOMETRICA, 2007, 75 (04) :1175-1189
[42]   Jackknife model averaging [J].
Hansen, Bruce E. ;
Racine, Jeffrey S. .
JOURNAL OF ECONOMETRICS, 2012, 167 (01) :38-46
[43]  
HASTIE T, 1993, J ROY STAT SOC B MET, V55, P757
[44]   A comparison of methods for multiclass support vector machines [J].
Hsu, CW ;
Lin, CJ .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, 13 (02) :415-425
[45]  
Huang JHZ, 2004, STAT SINICA, V14, P763
[46]   VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS [J].
Huang, Jian ;
Horowitz, Joel L. ;
Wei, Fengrong .
ANNALS OF STATISTICS, 2010, 38 (04) :2282-2313
[47]   Semiparametric model average prediction in panel data analysis [J].
Huang, Tao ;
Li, Jialiang .
JOURNAL OF NONPARAMETRIC STATISTICS, 2018, 30 (01) :125-144
[48]   Semi-varying coefficient multinomial logistic regression for disease progression risk prediction [J].
Ke, Yuan ;
Fu, Bo ;
Zhang, Wenyang .
STATISTICS IN MEDICINE, 2016, 35 (26) :4764-4778
[49]   Quantile regression with varying coefficients [J].
Kim, Mi-Ok .
ANNALS OF STATISTICS, 2007, 35 (01) :92-108
[50]   Multicategory support vector machines: Theory and application to the classification of microarray data and satellite radiance data [J].
Lee, YK ;
Lin, Y ;
Wahba, G .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2004, 99 (465) :67-81