A comparative investigation of methods for logistic regression with separated or nearly separated data

Cited by: 301
Author
Heinze, Georg [1]
Affiliation
[1] Med Univ Vienna, Sect Clin Biometr, Core Unit Med Stat & Informat, A-1090 Vienna, Austria
Keywords
bias reduction; exact logistic regression; infinite estimates; modified score function; penalized likelihood; sparse data;
DOI
10.1002/sim.2687
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
In logistic regression analysis of small or sparse data sets, results obtained by classical maximum likelihood methods cannot generally be trusted. In such analyses it may even happen that the likelihood meets the convergence criteria while at least one parameter estimate diverges to ±infinity. This situation has been termed 'separation', and it typically occurs whenever no events are observed in one of the two groups defined by a dichotomous covariate. More generally, separation is caused by a linear combination of continuous or dichotomous covariates that perfectly separates events from non-events. Separation implies infinite or zero maximum likelihood estimates of odds ratios, which are usually considered unrealistic. I provide some examples of separation and near-separation in clinical data sets and discuss some options to analyse such data, including exact logistic regression analysis and a penalized likelihood approach. Both methods supply finite point estimates in case of separation. Profile penalized likelihood confidence intervals for parameters show excellent behaviour in terms of coverage probability and provide higher power than exact confidence intervals. General advantages of the penalized likelihood approach are discussed. Copyright (c) 2006 John Wiley & Sons, Ltd.
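The separation problem described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical two-group data set (not taken from the paper) in which one group has no events, and it assumes that the penalized likelihood approach is of the Firth type, in which the score is modified to X'(y - π + h(1/2 - π)) with h the diagonal of the hat matrix. The function name `fit_logistic` and all data values are illustrative.

```python
import numpy as np

def fit_logistic(X, y, firth=False, max_iter=15, tol=1e-8):
    """Fisher scoring for logistic regression, optionally with a
    Firth-type modified score (an assumed form of the penalty)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        w = pi * (1.0 - pi)
        info = X.T @ (w[:, None] * X)  # Fisher information X'WX
        resid = y - pi
        if firth:
            # h_i = w_i * x_i' (X'WX)^(-1) x_i, the hat-matrix diagonal
            h = w * np.einsum("ij,ij->i", X @ np.linalg.inv(info), X)
            resid = resid + h * (0.5 - pi)
        step = np.linalg.solve(info, X.T @ resid)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical data: 5 events among 10 subjects with x = 0,
# and no events among 10 subjects with x = 1 (separation).
x = np.repeat([0.0, 1.0], 10)
y = np.array([1.0] * 5 + [0.0] * 5 + [0.0] * 10)
X = np.column_stack([np.ones_like(x), x])

ml = fit_logistic(X, y, firth=False, max_iter=15)
pl = fit_logistic(X, y, firth=True, max_iter=200)

print("ML slope after 15 iterations:", ml[1])   # keeps drifting towards -infinity
print("Penalized slope:", pl[1])                # finite
```

With ordinary maximum likelihood the slope moves further from zero at every iteration and never settles. The penalized fit converges to a finite value. For this saturated two-group case it should agree with adding 0.5 to each cell of the 2x2 table, that is, a slope of log(0.5/10.5), roughly -3.04.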
Pages: 4216-4226
Number of pages: 11