A model-based multithreshold method for subgroup identification

Cited by: 21
Authors
Wang, Jingli [1]
Li, Jialiang [1, 2, 3]
Li, Yaguang [4]
Wong, Weng Kee [5]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
[2] Duke Univ, NUS Grad Med Sch, Singapore, Singapore
[3] Singapore Eye Res Inst, Singapore, Singapore
[4] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[5] Univ Calif Los Angeles, Fielding Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90095 USA
Funding
National Institutes of Health (USA);
Keywords
change point; factor analysis; PCA; personalized medicine; Scleroderma; subgroup identification; UNBIASED VARIABLE SELECTION; DIVERGING NUMBER; EXPRESSION; TRIAL;
DOI
10.1002/sim.8136
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The thresholding variable plays a crucial role in subgroup identification for personalized medicine. Most existing partitioning methods split the sample based on a single predictor variable. In this paper, we consider setting the splitting rule from a combination of multivariate predictors, such as latent factors, principal components, and a weighted sum of predictors. Such a subgrouping method may lead to a more meaningful partitioning of the population than using a single variable. In addition, our method is based on a change point regression model and thus yields straightforward model-based prediction results. After choosing a particular form of the thresholding variable, we apply a two-stage multiple change point detection method to determine the subgroups and estimate the regression parameters. We show that our approach can produce two or more subgroups from the multiple change points and identify the true grouping with high probability. In addition, our estimation results enjoy oracle properties. We design a simulation study to compare the performance of our proposed method with existing methods and apply them to analyze data sets from a Scleroderma trial and a breast cancer study.
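The idea of thresholding on a combination of predictors can be illustrated with a minimal sketch. It is not the two-stage multiple change point procedure described in the abstract: it assumes a single change point, uses the first principal component as the thresholding variable, and finds the threshold by a plain grid search over least-squares fits. The function names, the synthetic data, and the `min_frac` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def first_principal_component(X):
    """Return the scores of the rows of X on the first principal component."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered matrix are the PCA loadings.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]

def fit_single_threshold(X, y, z, min_frac=0.1):
    """Grid-search one threshold on z (illustrative, single change point only).

    For each candidate threshold c, fit separate least-squares regressions
    on {z <= c} and {z > c} and keep the c with the smallest total RSS.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    candidates = np.sort(z)[int(min_frac * n):int((1 - min_frac) * n)]
    best_rss, best_c = np.inf, None
    for c in candidates:
        left = z <= c
        rss = 0.0
        for mask in (left, ~left):
            beta, *_ = np.linalg.lstsq(design[mask], y[mask], rcond=None)
            resid = y[mask] - design[mask] @ beta
            rss += float(resid @ resid)
        if rss < best_rss:
            best_rss, best_c = rss, c
    return best_c, best_rss

# Synthetic data (assumption): the slope on the first predictor flips
# sign depending on whether the first principal component exceeds 0.
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 3))
z = first_principal_component(X)
y = 1.0 + np.where(z > 0, 3.0, -3.0) * X[:, 0] + 0.1 * rng.normal(size=n)

threshold, rss = fit_single_threshold(X, y, z)
print(f"estimated threshold on the first principal component: {threshold:.3f}")
```

The two subgroups are then the observations with `z <= threshold` and `z > threshold`, each with its own fitted regression. Extending this to several change points, or to a latent factor or weighted sum as the thresholding variable, would require the detection method of the paper itself.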
Pages: 2605-2631
Page count: 27
Related papers
60 entries in total
[1]  
Anderson T. W., 2003, An introduction to multivariate statistical analysis, 3rd ed.
[2]  
[Anonymous], arXiv:1711.05386
[3]   Inferential theory for factor models of large dimensions. [J].
Bai, J .
ECONOMETRICA, 2003, 71 (01) :135-171
[4]   Common breaks in means and variances for panel data [J].
Bai, Jushan .
JOURNAL OF ECONOMETRICS, 2010, 157 (01) :78-92
[5]   Prediction by supervised principal components [J].
Bair, E ;
Hastie, T ;
Paul, D ;
Tibshirani, R .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) :119-137
[6]   Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds [J].
Barshan, Elnaz ;
Ghodsi, Ali ;
Azimifar, Zohreh ;
Jahromi, Mansoor Zolghadri .
PATTERN RECOGNITION, 2011, 44 (07) :1357-1371
[7]   SmcHD1, containing a structural-maintenance-of-chromosomes hinge domain, has a critical role in X inactivation [J].
Blewitt, Marnie E. ;
Gendrel, Anne-Valerie ;
Pang, Zhenyi ;
Sparrow, Duncan B. ;
Whitelaw, Nadia ;
Craig, Jeffrey M. ;
Apedaile, Anwyn ;
Hilton, Douglas J. ;
Dunwoodie, Sally L. ;
Brockdorff, Neil ;
Kay, Graham F. ;
Whitelaw, Emma .
NATURE GENETICS, 2008, 40 (05) :663-669
[8]   A PRIM approach to predictive-signature development for patient stratification [J].
Chen, Gong ;
Zhong, Hua ;
Belousov, Anton ;
Devanarayan, Viswanath .
STATISTICS IN MEDICINE, 2015, 34 (02) :317-342
[9]   A general statistical framework for subgroup identification and comparative treatment scoring [J].
Chen, Shuai ;
Tian, Lu ;
Cai, Tianxi ;
Yu, Menggang .
BIOMETRICS, 2017, 73 (04) :1199-1209
[10]   Forward Variable Selection for Sparse Ultra-High Dimensional Varying Coefficient Models [J].
Cheng, Ming-Yen ;
Honda, Toshio ;
Zhang, Jin-Ting .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (515) :1209-1221