AdaBoost Semiparametric Model Averaging Prediction for Multiple Categories

Cited by: 27
Authors
Li, Jialiang [1]
Lv, Jing [2]
Wan, Alan T. K. [3]
Liao, Jun [4]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
[2] Southwest Univ, Sch Math & Stat, Chongqing 400715, Peoples R China
[3] City Univ Hong Kong, Dept Management Sci, Kowloon Tong, Hong Kong, Peoples R China
[4] Renmin Univ China, Sch Stat, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Boosting; Model averaging; Model misspecification; Prediction accuracy; Smoothing; Varying coefficient structure identification; GENERALIZED LINEAR-MODELS; NONCONCAVE PENALIZED LIKELIHOOD; VARYING COEFFICIENT MODELS; VARIABLE SELECTION; STATISTICAL VIEW; DIMENSION REDUCTION; EVIDENCE CONTRARY; LOGISTIC-REGRESSION; JMLR; 9; CLASSIFICATION
DOI
10.1080/01621459.2020.1790375
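Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract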
Model averaging techniques are very useful for model-based prediction. However, most earlier works in this field focused on parametric models and continuous responses. In this article, we study varying coefficient multinomial logistic models and propose a semiparametric model averaging prediction (SMAP) approach for multi-category outcomes. The proposed procedure does not need any artificial specification of the index variable in the adopted varying coefficient sub-model structure to forecast the response. In particular, this new SMAP method is more flexible and robust against model misspecification. To improve the practical predictive performance, we combine SMAP with the AdaBoost algorithm to obtain more accurate estimates of class probabilities and model averaging weights. We compare our proposed methods with all existing model averaging approaches and a wide range of popular classification methods via extensive simulations. An automobile classification study is included to illustrate the merits of our methodology. Supplementary materials for this article are available online.
Pages: 495-509
Page count: 15