Scalable holistic linear regression

Cited by: 12
Authors
Bertsimas, Dimitris [1, 2]
Li, Michael Lingzhi [2]
Affiliations
[1] MIT, Sloan Sch Management, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Operat Res Ctr, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
Holistic linear regression; Multicollinearity and significance in linear regression; Mixed-integer optimization; SELECTION;
DOI
10.1016/j.orl.2020.02.008
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research];
Subject classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
We propose a new scalable algorithm for holistic linear regression building on Bertsimas & King (2016). Specifically, we develop new theory to model significance and multicollinearity as lazy constraints rather than checking the conditions iteratively. The resulting algorithm scales with the number of samples n in the 10,000s, compared to the low 100s in the previous framework. Computational results on real and synthetic datasets show it greatly improves from previous algorithms in accuracy, false detection rate, computational time and scalability. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 203-208
Number of pages: 6
Related papers
13 in total
[1]   Characterization of the equivalence of robustification and regularization in linear and matrix regression [J].
Bertsimas, Dimitris ;
Copenhaver, Martin S. .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2018, 270 (03) :931-942
[2]   OR Forum-An Algorithmic Approach to Linear Regression [J].
Bertsimas, Dimitris ;
King, Angela .
OPERATIONS RESEARCH, 2016, 64 (01) :2-16
[3]  
Carrizosa E., 2017, ENHANCING INTERPRETA
[4]  
Chung S., 2017, ARXIV171204543
[5]  
Dua D., 2020, UCI ml repository
[6]   ASYMPTOTIC NORMALITY AND CONSISTENCY OF LEAST SQUARES ESTIMATORS FOR FAMILIES OF LINEAR REGRESSIONS [J].
EICKER, F .
ANNALS OF MATHEMATICAL STATISTICS, 1963, 34 (02) :447-&
[7]   ANALYSIS AND SELECTION OF VARIABLES IN LINEAR-REGRESSION [J].
HOCKING, RR .
BIOMETRICS, 1976, 32 (01) :1-49
[8]   A note regarding the condition number: The case of spurious and latent multicollinearity [J].
Lazaridis, Alexis .
QUALITY & QUANTITY, 2007, 41 (01) :123-135
[9]   DETECTING MULTICOLLINEARITY [J].
MANSFIELD, ER ;
HELMS, BP .
AMERICAN STATISTICIAN, 1982, 36 (03) :158-160
[10]   A caution regarding rules of thumb for variance inflation factors [J].
O'Brien, Robert M. .
QUALITY & QUANTITY, 2007, 41 (05) :673-690