THE PHASE TRANSITION FOR THE EXISTENCE OF THE MAXIMUM LIKELIHOOD ESTIMATE IN HIGH-DIMENSIONAL LOGISTIC REGRESSION

Cited by: 59
Authors
Candes, Emmanuel J. [1, 2]
Sur, Pragya [2]
Affiliations
[1] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[2] Harvard Univ, Harvard John A Paulson Sch Engn & Appl Sci, Cambridge, MA 02138 USA
Keywords
High-dimensional logistic regression; MLE phase transition
DOI
10.1214/18-AOS1789
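Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract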
This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp "phase transition." We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n \to \kappa$, we show that if the problem is sufficiently high dimensional in the sense that $\kappa > h_{\mathrm{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa < h_{\mathrm{MLE}}$, the MLE asymptotically exists with probability one.
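The transition described in the abstract can be illustrated numerically in one classical special case. The sketch below does not compute the paper's boundary curve. It assumes the null setting (all regression coefficients zero, so labels are fair coin flips), no intercept, and covariates in general position, which holds almost surely for Gaussian data. Under those assumptions, the logistic MLE fails to exist exactly when the labelled points can be separated by a hyperplane through the origin, and Cover's (1965) counting formula gives the probability of that event in closed form. To the best of our understanding this null case corresponds to a threshold of 1/2 for p/n, but that reading is an assumption here rather than something stated in the abstract. The function name `cover_separable_probability` is hypothetical.

```python
from math import comb


def cover_separable_probability(n: int, p: int) -> float:
    """Probability that n points in general position in R^p, each given an
    independent fair-coin label, are separable by a hyperplane through the
    origin (Cover's 1965 function-counting result).

    Assumption for this sketch: separability is taken as the event that the
    logistic-regression MLE does not exist (null signal, no intercept).
    """
    if n < 1 or p < 1:
        raise ValueError("n and p must be positive integers")
    # Cover: 2 * sum_{k=0}^{p-1} C(n-1, k) of the 2^n labelings are separable,
    # so the probability is that sum divided by 2^(n-1).
    separable = sum(comb(n - 1, k) for k in range(p))
    return separable / 2 ** (n - 1)


if __name__ == "__main__":
    n = 1000
    for kappa in (0.40, 0.45, 0.50, 0.55, 0.60):
        p = round(kappa * n)
        prob = cover_separable_probability(n, p)
        print(f"kappa = {kappa:.2f}  P(separable, i.e. no MLE) = {prob:.6f}")
```

As n grows with p/n held fixed, the printed probability moves toward 0 below the threshold and toward 1 above it, which is the kind of sharp change the abstract refers to. For nonzero signal the boundary depends on the coefficient magnitudes and this closed form no longer applies.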
Pages: 27-42
Page count: 16