A LARGE SCALE ANALYSIS OF LOGISTIC REGRESSION: ASYMPTOTIC PERFORMANCE AND NEW INSIGHTS

Cited by: 0
Authors
Mai, Xiaoyi [1 ,2 ]
Liao, Zhenyu [1 ,2 ]
Couillet, Romain [1 ,2 ]
Affiliations
[1] Univ Paris Saclay, Cent Supelec, St Aubin, France
[2] Univ Grenoble Alpes, GIPSA Lab, Grenoble, France
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Keywords
High dimensional statistic; logistic regression; machine learning; random matrix theory;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Logistic regression, one of the most popular machine learning methods for binary classification, has long been believed to be unbiased. In this paper, we consider the "hard" classification problem of separating high-dimensional Gaussian vectors, where the data dimension p and the sample size n are both large. Based on recent advances in random matrix theory (RMT) and high-dimensional statistics, we evaluate the asymptotic distribution of the logistic regression classifier and, consequently, provide the associated classification performance. This brings new insights into the internal mechanism of the logistic regression classifier, including a possible bias in the separating hyperplane, as well as into practical issues such as hyper-parameter tuning, thereby opening the door to novel RMT-inspired improvements.
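The setting described in the abstract can be explored empirically with a small simulation. The sketch below is not the authors' method or their asymptotic analysis. It only assumes a symmetric two-class Gaussian model, x ~ N(y * mu, I) with labels y = +/-1, and fits an L2-regularized logistic regression by plain gradient descent. The function names, the regularization strength `lam`, the step size, and the signal strength are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def sample(n, mu, rng):
    """Draw n labeled points: y = +/-1 with equal odds, x ~ N(y * mu, I)."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.choice((-1, 1))
        xs.append([y * m + rng.gauss(0.0, 1.0) for m in mu])
        ys.append(y)
    return xs, ys


def fit_logistic(xs, ys, lam=0.1, lr=0.5, steps=100):
    """Gradient descent on mean logistic loss plus (lam / 2) * ||w||^2."""
    n, p = len(xs), len(xs[0])
    w = [0.0] * p
    for _ in range(steps):
        grad = [lam * wj for wj in w]  # gradient of the ridge penalty
        for x, y in zip(xs, ys):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # d/dw log(1 + exp(-margin)) = -y * sigmoid(-margin) * x
            coef = -y * sigmoid(-margin) / n
            for j in range(p):
                grad[j] += coef * x[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def accuracy(w, xs, ys):
    """Fraction of points whose predicted sign matches the label."""
    hits = 0
    for x, y in zip(xs, ys):
        score = sum(wj * xj for wj, xj in zip(w, x))
        hits += (score >= 0) == (y == 1)
    return hits / len(xs)


def run_experiment(p=40, n=200, n_test=500, signal=2.0, seed=0):
    """Train on n points in dimension p and return the test accuracy."""
    rng = random.Random(seed)
    mu = [signal / math.sqrt(p)] * p  # class mean with norm equal to `signal`
    train_x, train_y = sample(n, mu, rng)
    test_x, test_y = sample(n_test, mu, rng)
    w = fit_logistic(train_x, train_y)
    return accuracy(w, test_x, test_y)


if __name__ == "__main__":
    # Vary the ratio p / n while keeping the class separation fixed.
    for p in (10, 40, 100):
        print(p, run_experiment(p=p))
```

Running the script prints the test accuracy for several dimensions at a fixed sample size, which gives a rough feel for how the ratio p / n affects the classifier. Any quantitative comparison with the paper's asymptotic results would need the paper's own formulas, which this record does not reproduce.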
Pages: 3357-3361
Page count: 5
Related papers
50 in total
  • [41] A New Edge Detector Based on SMOTE and Logistic Regression
    Fernandez-Peralta, Raquel
    Massanet, Sebastia
    Mir, Arnau
    ADVANCES IN FUZZY LOGIC AND TECHNOLOGY 2017, VOL 2, 2018, 642 : 48 - 57
  • [42] Predictive Performance of Logistic Regression for Imbalanced Data with Categorical Covariate
    Abd Rahman, Hezlin Aryani
    Wah, Yap Bee
    Huat, Ong Seng
    PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY, 2020, 28 (04): : 1141 - 1161
  • [43] Predictive Performance of Logistic Regression for Imbalanced Data with Categorical Covariate
    Abd Rahman, Hezlin Aryani
    Wah, Yap Bee
    Huat, Ong Seng
    PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY, 2021, 29 (01): : 181 - 197
  • [44] Factors Affecting Performance in Cooperative Terengganu By Using Logistic Regression
    Safiar, Nor Bazilah
    Ahmad, Sabri
    Yacob, Jusoh
    MALAYSIAN JOURNAL OF FUNDAMENTAL AND APPLIED SCIENCES, 2012, 8 (04): : 256 - 259
  • [45] Assessing the performance of variational methods for mixed logistic regression models
    Rijmen, Frank
    Vomlel, Jiri
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2008, 78 (08) : 765 - 779
  • [46] LEAST MEDIAN OF WEIGHTED SQUARES IN LOGISTIC-REGRESSION WITH LARGE STRATA
    CHRISTMANN, A
    BIOMETRIKA, 1994, 81 (02) : 413 - 417
  • [47] Adoption of agroforestry in the hills of Nepal: a logistic regression analysis
    Neupane, RP
    Sharma, KR
    Thapa, GB
    AGRICULTURAL SYSTEMS, 2002, 72 (03) : 177 - 196
  • [48] An Improved Lexicon using Logistic Regression for Sentiment Analysis
    Bhargava, Kunal
    Katarya, Rahul
    2017 INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES FOR SMART NATION (IC3TSN), 2017, : 332 - 337
  • [49] Face recognition based on PCA and logistic regression analysis
    Zhou, Changjun
    Wang, Lan
    Zhang, Qiang
    Wei, Xiaopeng
    OPTIK, 2014, 125 (20): : 5916 - 5919
  • [50] Performance Drop Detector based on Bayesian Network and Logistic Regression
    Zhang, Rui
    Zhang, Xiaojuan
    Zhang, Zhihua
    Xie, Shuangyuan
    Wang, Zuyuan
    Wu, Tiangang
    2018 INTERNATIONAL JOINT CONFERENCE ON INFORMATION, MEDIA AND ENGINEERING (ICIME), 2018, : 288 - 291