A sparse version of the ridge logistic regression for large-scale text categorization

Cited by: 28
Authors
Aseervatham, Sujeevan [1 ]
Antoniadis, Anestis [2 ]
Gaussier, Eric [1 ]
Burlet, Michel [3 ]
Denneulin, Yves [4 ]
Affiliations
[1] Univ Grenoble 1, LIG, F-38041 Grenoble 9, France
[2] Univ Grenoble 1, LJK, F-38041 Grenoble 9, France
[3] Univ Grenoble 1, Lab Leibniz, F-38031 Grenoble 1, France
[4] ENSIMAG, LIG, F-38330 Montbonnot St Martin, France
Keywords
Logistic regression; Model selection; Text categorization; Large-scale categorization; REGULARIZATION; SELECTION; MODEL
DOI
10.1016/j.patrec.2010.09.023
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Ridge logistic regression has been used successfully in text categorization problems and has been shown to reach the same performance as the Support Vector Machine, with the main advantage of producing a probability value rather than a score. However, the dense solution of the ridge makes its use impractical for large-scale categorization. On the other hand, LASSO regularization is able to produce sparse solutions, but its performance is dominated by the ridge when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method which tries to approximate the ridge solution by a sparse solution. The method first computes the ridge solution and then performs feature selection. The experimental evaluations show that our method gives a solution which is a good trade-off between the ridge and LASSO solutions. (C) 2010 Elsevier B.V. All rights reserved.
Pages: 101-106
Number of pages: 6
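The abstract above outlines a two-step idea: fit the dense ridge (L2-regularized) logistic regression first, then select features so that a sparse model approximates the ridge solution. The following Python snippet is a minimal sketch of that general idea using scikit-learn; the dataset, the value of k, and the "keep the k largest |weight| features" rule are illustrative assumptions and do not reproduce the authors' actual selection procedure.

```python
# Hypothetical sketch of the ridge-then-select idea, not the paper's algorithm.
import numpy as np
from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative text-categorization data (downloads the 20 Newsgroups vectors).
X, y = fetch_20newsgroups_vectorized(subset="train", return_X_y=True)
y = (y == 0).astype(int)  # binarize: class 0 vs. the rest
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: dense ridge (L2) logistic regression on all features.
ridge = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
ridge.fit(X_tr, y_tr)

# Step 2: feature selection guided by the ridge weights.
# Keeping the k features with largest |weight| is an assumption for
# illustration; the paper's selection criterion is more elaborate.
k = 2000
keep = np.argsort(np.abs(ridge.coef_.ravel()))[-k:]

# Step 3: refit on the selected subset to obtain a sparse, cheaper model.
sparse_model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
sparse_model.fit(X_tr[:, keep], y_tr)

print("dense ridge accuracy :", ridge.score(X_te, y_te))
print("sparse model accuracy:", sparse_model.score(X_te[:, keep], y_te))
```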
Related papers
50 items in total
  • [21] Regression testing approach for large-scale systems
    Kandil, Passant
    Moussa, Sherin
    Badr, Nagwa
    2014 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW), 2014, : 132 - 133
  • [22] Optimal subsampling for large-scale quantile regression
    Ai, Mingyao
    Wang, Fei
    Yu, Jun
    Zhang, Huiming
    JOURNAL OF COMPLEXITY, 2021, 62
  • [23] A fast and scalable framework for large-scale and ultrahigh-dimensional sparse regression with application to the UK Biobank
    Qian, Junyang
    Tanigawa, Yosuke
    Du, Wenfei
    Aguirre, Matthew
    Chang, Chris
    Tibshirani, Robert
    Rivas, Manuel A.
    Hastie, Trevor
    PLOS GENETICS, 2020, 16 (10):
  • [24] A LARGE SCALE ANALYSIS OF LOGISTIC REGRESSION: ASYMPTOTIC PERFORMANCE AND NEW INSIGHTS
    Mai, Xiaoyi
    Liao, Zhenyu
    Couillet, Romain
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3357 - 3361
  • [25] Sampling Lasso quantile regression for large-scale data
    Xu, Qifa
    Cai, Chao
    Jiang, Cuixia
    Sun, Fang
    Huang, Xue
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2018, 47 (01) : 92 - 114
  • [26] Large-Scale Aerial Image Categorization Using a Multitask Topological Codebook
    Zhang, Luming
    Wang, Meng
    Hong, Richang
    Yin, Bao-Cai
    Li, Xuelong
    IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (02) : 535 - 545
  • [28] An empirical comparison of min-max-modular k-NN with different voting methods to large-scale text categorization
    Wu, Ke
    Lu, Bao-Liang
    Utiyama, Masao
    Isahara, Hitoshi
    SOFT COMPUTING, 2008, 12 (07) : 647 - 655
  • [29] Data-Driven Robust and Sparse Solutions for Large-scale Fuzzy Portfolio Optimization
    Yu, Na
    Liang, You
    Thavaneswaran, A.
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [30] Large-scale regression with non-convex loss and penalty
    Buccini, Alessandro
    Cabrera, Omar De la Cruz
    Donatelli, Marco
    Martinelli, Andrea
    Reichel, Lothar
    APPLIED NUMERICAL MATHEMATICS, 2020, 157 : 590 - 601