On assessing binary regression models based on ungrouped data

Cited by: 2
Authors
Lu, Chunling [1 ,2 ]
Yang, Yuhong [3 ]
Affiliations
[1] Harvard Univ, Brigham & Womens Hosp, Div Global Hlth, Boston, MA 02115 USA
[2] Harvard Univ, Dept Global Hlth & Social Med, Boston, MA 02115 USA
[3] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
Keywords
Goodness of fit; Hosmer-Lemeshow test; Model assessment; Model selection diagnostics; GOODNESS-OF-FIT; CROSS-VALIDATION; SELECTION; TESTS;
DOI
10.1111/biom.12969
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Assessing a binary regression model based on ungrouped data is a commonly encountered but very challenging problem. Although tests such as the Hosmer-Lemeshow test and the le Cessie-van Houwelingen test have been devised and widely used in applications, they often have low power in detecting lack of fit, and there is little theoretical justification for when they work well. In this article, we propose a new approach based on a cross-validation voting system to address the problem. Under proper conditions, the new method comes with a theoretical guarantee that the probabilities of type I and type II errors both converge to zero as the sample size increases, and our simulation results demonstrate that it performs very well.
Pages: 5-12
Page count: 8