Variable selection for ultra-high-dimensional logistic models

Cited by: 0
Authors
Du, Pang [1]
Wu, Pan [2]
Liang, Hua [3]
Affiliations
[1] Virginia Polytech Inst & State Univ, Dept Stat, Blacksburg, VA 24061 USA
[2] Univ Rochester, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
[3] George Washington Univ, Dept Stat, Washington, DC 20052 USA
Source
PERSPECTIVES ON BIG DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS | 2014 / Vol. 622
Keywords
Concave convex procedure; coordinate ascent; coordinate descent; LASSO; local linear approximation; local quadratic approximation; oracle property; penalized variable selection; SCAD; GENERALIZED LINEAR-MODELS; NONCONCAVE PENALIZED LIKELIHOOD; DIVERGING NUMBER; ADAPTIVE LASSO; REGULARIZATION; CONSISTENCY;
DOI
10.1090/conm/622/12436
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We propose a variable selection procedure based on optimizing a nonconcave penalized likelihood for logistic regression models in which the dimension of the covariates, p, diverges at an exponential rate in n. We first establish the oracle property of the procedure in this ultra-high-dimensional setting. Our optimization algorithm combines recent developments for solving regularization problems, including the concave convex procedure and the coordinate descent algorithm. Through extensive simulations, we show the promise of the proposed procedure in various high-dimensional logistic regression settings. An application to gene expression data from a breast cancer study illustrates the use of the method.
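The abstract names the ingredients of the optimization (a nonconcave penalty, the concave convex procedure, and coordinate descent) but not their details. The following is a minimal sketch of how such a combination can look, not the authors' implementation. It assumes the SCAD penalty from the keyword list with the conventional a = 3.7, standardized covariates, no intercept, and a simple curvature bound of 1/4 for the logistic loss; the function names `scad_derivative`, `soft_threshold`, and `scad_logistic_cccp` are hypothetical.

```python
import numpy as np

def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |t| (assumes a > 2)."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def soft_threshold(z, thr):
    """Soft-thresholding operator used by lasso-type coordinate updates."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

def scad_logistic_cccp(X, y, lam, a=3.7, n_outer=20, n_inner=100, tol=1e-6):
    """Sketch: SCAD-penalized logistic regression via CCCP + coordinate descent.

    Assumptions (not taken from the paper): y is coded 0/1, columns of X are
    standardized, there is no intercept, and each coordinate step minimizes a
    quadratic upper bound of the average negative log-likelihood.
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Per-coordinate curvature bound: the logistic variance is at most 1/4.
    L = (X ** 2).mean(axis=0) / 4.0
    for _ in range(n_outer):
        beta_old = beta.copy()
        # CCCP step: write SCAD as lam*|b| plus a concave remainder and
        # linearize the remainder at the current iterate.
        c = (scad_derivative(beta, lam, a) - lam) * np.sign(beta)
        eta = X @ beta
        # Inner loop: coordinate descent on the resulting convex problem.
        for _ in range(n_inner):
            max_change = 0.0
            for j in range(p):
                if L[j] == 0.0:
                    continue
                prob = 1.0 / (1.0 + np.exp(-eta))
                grad_j = X[:, j] @ (prob - y) / n
                new_bj = soft_threshold(beta[j] - (grad_j + c[j]) / L[j],
                                        lam / L[j])
                delta = new_bj - beta[j]
                if delta != 0.0:
                    eta += delta * X[:, j]
                    beta[j] = new_bj
                    max_change = max(max_change, abs(delta))
            if max_change < tol:
                break
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta
```

In this sketch the first outer iteration starts from zero and therefore solves a plain lasso problem; later iterations reduce the effective penalty on coefficients that are already large, which is the usual motivation for SCAD-type penalties.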
Pages: 141-158
Number of pages: 18