A sparse version of the ridge logistic regression for large-scale text categorization

Cited by: 28
Authors
Aseervatham, Sujeevan [1]
Antoniadis, Anestis [2]
Gaussier, Eric [1]
Burlet, Michel [3]
Denneulin, Yves [4]
Affiliations
[1] Univ Grenoble 1, LIG, F-38041 Grenoble 9, France
[2] Univ Grenoble 1, LJK, F-38041 Grenoble 9, France
[3] Univ Grenoble 1, Lab Leibniz, F-38031 Grenoble 1, France
[4] ENSIMAG, LIG, F-38330 Montbonnot St Martin, France
Keywords
Logistic regression; Model selection; Text categorization; Large scale categorization; REGULARIZATION; SELECTION; MODEL
DOI
10.1016/j.patrec.2010.09.023
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The ridge logistic regression has successfully been used in text categorization problems, and it has been shown to reach the same performance as the Support Vector Machine, but with the main advantage of computing a probability value rather than a score. However, the dense solution of the ridge makes its use impractical for large-scale categorization. On the other hand, LASSO regularization is able to produce sparse solutions, but its performance is dominated by the ridge when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method which tries to approach the ridge solution by a sparse solution. The method first computes the ridge solution and then performs feature selection. The experimental evaluations show that our method gives a solution which is a good trade-off between the ridge and LASSO solutions. (C) 2010 Elsevier B.V. All rights reserved.
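The two-step recipe described in the abstract (compute the ridge solution, then perform feature selection) can be illustrated with a minimal Python sketch. The abstract does not say how the features are selected, so the "keep the k largest-magnitude weights" rule used here is an assumption made purely for illustration and should not be read as the authors' method. The function names, the plain gradient-descent solver, the toy data, and all parameter values are likewise hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, epochs=300):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    X is a list of feature lists, y a list of labels in {0, 1}.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the ridge penalty
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def sparsify_top_k(w, k):
    """Assumed selection rule: keep the k largest |w_j|, zero the rest."""
    order = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    keep = set(order[:k])
    return [wj if j in keep else 0.0 for j, wj in enumerate(w)]


def predict_proba(w, x):
    # Logistic regression outputs a probability rather than a raw score.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


# Toy data (hypothetical): 5 features, only the first two carry signal.
random.seed(0)
X, y = [], []
for _ in range(100):
    x = [random.gauss(0, 1) for _ in range(5)]
    X.append(x)
    y.append(1 if 2.0 * x[0] - 1.5 * x[1] + random.gauss(0, 0.5) > 0 else 0)

w_ridge = fit_ridge_logistic(X, y)        # step 1: dense ridge solution
w_sparse = sparsify_top_k(w_ridge, k=2)   # step 2: feature selection
print("ridge :", [round(v, 3) for v in w_ridge])
print("sparse:", [round(v, 3) for v in w_sparse])
```

In this sketch the sparse vector reuses the ridge weights of the retained features unchanged; refitting on the selected features would be another option, and the abstract does not indicate which, if either, the paper uses.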
Pages: 101-106
Number of pages: 6
Related papers
50 in total
  • [1] Large-Scale Sparse Logistic Regression
    Liu, Jun
    Chen, Jianhui
    Ye, Jieping
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009 : 547 - 555
  • [2] Weighted logistic regression for large-scale imbalanced and rare events data
    Maalouf, Maher
    Siddiqi, Mohammad
    KNOWLEDGE-BASED SYSTEMS, 2014, 59 : 142 - 148
  • [3] Trust region Newton method for large-scale logistic regression
    Lin, Chih-Jen
    Weng, Ruby C.
    Keerthi, S. Sathiya
    JOURNAL OF MACHINE LEARNING RESEARCH, 2008, 9 : 627 - 650
  • [4] Large-Scale Linguistic Ontology as a Basis for Text Categorization of Legislative Documents
    Loukachevitch, Natalia
    Dobrov, Boris
    LEGAL KNOWLEDGE AND INFORMATION SYSTEMS, 2005, 134 : 109 - 110
  • [5] LARGE-SCALE MULTIVARIATE SPARSE REGRESSION WITH APPLICATIONS TO UK BIOBANK
    Qian, Junyang
    Tanigawa, Yosuke
    Li, Ruilin
    Tibshirani, Robert
    Rivas, Manuel A.
    Hastie, Trevor
    ANNALS OF APPLIED STATISTICS, 2022, 16 (03) : 1891 - 1918
  • [6] A Fast Hybrid Algorithm for Large-Scale l1-Regularized Logistic Regression
    Shi, Jianing
    Yin, Wotao
    Osher, Stanley
    Sajda, Paul
    JOURNAL OF MACHINE LEARNING RESEARCH, 2010, 11 : 713 - 741
  • [7] Random forest versus logistic regression: a large-scale benchmark experiment
    Couronne, Raphael
    Probst, Philipp
    Boulesteix, Anne-Laure
    BMC BIOINFORMATICS, 2018, 19
  • [8] Random forest versus logistic regression: a large-scale benchmark experiment
    Raphael Couronné
    Philipp Probst
    Anne-Laure Boulesteix
    BMC Bioinformatics, 19
  • [9] When Homomorphic Encryption Marries Secret Sharing: Secure Large-Scale Sparse Logistic Regression and Applications in Risk Control
    Chen, Chaochao
    Zhou, Jun
    Wang, Li
    Wu, Xibin
    Fang, Wenjing
    Tan, Jin
    Wang, Lei
    Liu, Alex X.
    Wang, Hao
    Hong, Cheng
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021 : 2652 - 2662
  • [10] A logistic regression-based smoothing method for Chinese text categorization
    Yen, Show-Jane
    Lee, Yue-Shi
    Ying, Jia-Ching
    Wu, Yu-Chieh
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (09) : 11581 - 11590