A LARGE SCALE ANALYSIS OF LOGISTIC REGRESSION: ASYMPTOTIC PERFORMANCE AND NEW INSIGHTS

Cited by: 0
Authors
Mai, Xiaoyi [1, 2]
Liao, Zhenyu [1, 2]
Couillet, Romain [1, 2]
Affiliations
[1] Univ Paris Saclay, Cent Supelec, St Aubin, France
[2] Univ Grenoble Alpes, GIPSA Lab, Grenoble, France
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
High dimensional statistics; logistic regression; machine learning; random matrix theory
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Logistic regression, one of the most popular machine learning methods for binary classification, has long been believed to be unbiased. In this paper, we consider the "hard" classification problem of separating high-dimensional Gaussian vectors, where the data dimension p and the sample size n are both large. Based on recent advances in random matrix theory (RMT) and high-dimensional statistics, we evaluate the asymptotic distribution of the logistic regression classifier and, consequently, provide the associated classification performance. This brings new insights into the internal mechanism of the logistic regression classifier, including a possible bias in the separating hyperplane, as well as into practical issues such as hyper-parameter tuning, thereby opening the door to novel RMT-inspired improvements.
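The setting described in the abstract can be explored empirically with a small simulation. The sketch below is not the authors' RMT analysis. It assumes a symmetric two-class Gaussian mixture with means ±mu and identity covariance, an L2-regularized logistic loss, and plain gradient descent. The function name `simulate` and all parameter values (`n`, `p`, `mu_norm`, `lam`) are hypothetical choices made for illustration.

```python
import math
import numpy as np


def simulate(n=400, p=100, mu_norm=1.5, lam=0.1, n_test=5000, steps=500, seed=0):
    """Fit ridge-regularized logistic regression on Gaussian mixture data.

    Assumed model (not taken from the paper): x = y * mu + z with
    z ~ N(0, I_p) and labels y in {-1, +1} drawn with equal probability.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[0] = mu_norm  # class means are +mu and -mu

    def sample(m):
        y = rng.choice([-1.0, 1.0], size=m)
        x = y[:, None] * mu + rng.standard_normal((m, p))
        return x, y

    X, y = sample(n)
    X_test, y_test = sample(n_test)

    # Objective: (1/n) * sum log(1 + exp(-y_i x_i^T beta)) + (lam/2) * ||beta||^2.
    # Step size 1/L, with L an upper bound on the gradient's Lipschitz constant.
    lipschitz = 0.25 * np.linalg.norm(X, 2) ** 2 / n + lam
    beta = np.zeros(p)
    for _ in range(steps):
        margins = y * (X @ beta)
        # 1 / (1 + exp(margins)), written with tanh for numerical stability.
        weights = 0.5 * (1.0 - np.tanh(margins / 2.0))
        grad = -(X.T @ (y * weights)) / n + lam * beta
        beta -= grad / lipschitz

    accuracy = float(np.mean(np.sign(X_test @ beta) == y_test))
    # Cosine between the learned direction and the true mean direction.
    cosine = float(beta @ mu / (np.linalg.norm(beta) * np.linalg.norm(mu)))
    # Bayes-optimal accuracy for this assumed model is Phi(||mu||).
    bayes = 0.5 * (1.0 + math.erf(mu_norm / math.sqrt(2.0)))
    return accuracy, cosine, bayes


if __name__ == "__main__":
    acc, cos, bayes = simulate()
    print(f"test accuracy={acc:.3f}  cosine to mu={cos:.3f}  Bayes accuracy={bayes:.3f}")
```

Varying `lam` or the ratio `p / n` in this sketch gives a rough feel for how regularization and dimensionality affect the learned hyperplane. It does not reproduce the asymptotic results of the paper.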
Pages: 3357 - 3361
Page count: 5
Related papers
50 in total
  • [21] Logistic Regression Ensemble for Predicting Customer Defection with Very Large Sample Size
    Kuswanto, Heri
    Asfihani, Ayu
    Sarumaha, Yogi
    Ohwada, Hayato
    THIRD INFORMATION SYSTEMS INTERNATIONAL CONFERENCE 2015, 2015, 72 : 86 - 93
  • [22] LOGISTIC REGRESSION ANALYSIS WITH STANDARDIZED MARKERS
    Huang, Ying
    Pepe, Margaret S.
    Feng, Ziding
    ANNALS OF APPLIED STATISTICS, 2013, 7 (03) : 1640 - 1662
  • [23] Empirical logit analysis is not logistic regression
    Donnelly, Seamus
    Verkuilen, Jay
    JOURNAL OF MEMORY AND LANGUAGE, 2017, 94 : 28 - 42
  • [24] Further Improving the Performance of Logistic Regression Analysis Using Double Extreme Ranking
    Samawi, Hani M.
    Zhang, Xinyan
    Rochani, Haresh
    JOURNAL OF STATISTICAL THEORY AND PRACTICE, 2020, 14 (01)
  • [25] Implementation of logistic regression into technical analysis
    Hruska, Juraj
    MATHEMATICAL METHODS IN ECONOMICS 2013, PTS I AND II, 2013 : 297 - 302
  • [26] One-pass Logistic Regression for Label-drift and Large-scale Classification on Distributed Systems
    Vu Nguyen
    Tu Dinh Nguyen
    Trung Le
    Venkatesh, Svetha
    Dinh Phung
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2016 : 1113 - 1118
  • [27] MODEL FOR THE PREDICTION OF PERFORMANCE WITH ECOINNOVATION CAPABILITY DEVELOPMENT CRITERIA: A MILITARY LOGISTIC REGRESSION ANALYSIS
    Badila, M. I.
    Barsan, G.
    Cioca, L. I.
    POLISH JOURNAL OF MANAGEMENT STUDIES, 2024, 29 (02) : 87 - 120
  • [28] Logistic regression in large rare events and imbalanced data: A performance comparison of prior correction and weighting methods
    Maalouf, Maher
    Homouz, Dirar
    Trafalis, Theodore B.
    COMPUTATIONAL INTELLIGENCE, 2018, 34 (01) : 161 - 174
  • [29] Cluster and Logistic Regression Distribution of Students' Performance by Classification
    Soomro, Nareena
    Razaque, Fahad
    Soomro, Safeeullah
    Shaikh, Shoaib
    Kumar, Natesh
    Abro, Ghulam E. Mustafa
    Abid, Ghulam
    EMERGING TECHNOLOGIES IN COMPUTING, ICETIC 2018, 2018, 200 : 209 - 219
  • [30] A Performance Analysis of Logistic Regression and Support Vector Machine Classifiers for Spoof Fingerprint Detection
    Ibrahim, Yusuf
    Mu'azu, Muhammed B.
    Adedokun, Adewale E.
    Sha'aban, Yusuf A.
    2017 IEEE 3RD INTERNATIONAL CONFERENCE ON ELECTRO-TECHNOLOGY FOR NATIONAL DEVELOPMENT (NIGERCON), 2017 : 1 - 5