A sparse version of the ridge logistic regression for large-scale text categorization

Cited by: 28
Authors
Aseervatham, Sujeevan [1]
Antoniadis, Anestis [2]
Gaussier, Eric [1]
Burlet, Michel [3]
Denneulin, Yves [4]
Affiliations
[1] Univ Grenoble 1, LIG, F-38041 Grenoble 9, France
[2] Univ Grenoble 1, LJK, F-38041 Grenoble 9, France
[3] Univ Grenoble 1, Lab Leibniz, F-38031 Grenoble 1, France
[4] ENSIMAG, LIG, F-38330 Montbonnot St Martin, France
Keywords
Logistic regression; Model selection; Text categorization; Large scale categorization; REGULARIZATION; SELECTION; MODEL
DOI
10.1016/j.patrec.2010.09.023
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ridge logistic regression has successfully been used in text categorization problems, and it has been shown to reach the same performance as the Support Vector Machine, with the main advantage of computing a probability value rather than a score. However, the dense solution of the ridge makes its use impractical for large-scale categorization. On the other hand, LASSO regularization is able to produce sparse solutions, but its performance is dominated by the ridge when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method which tries to approach the ridge solution by a sparse solution. The method first computes the ridge solution and then performs feature selection. The experimental evaluations show that our method gives a solution which is a good trade-off between the ridge and LASSO solutions. (C) 2010 Elsevier B.V. All rights reserved.
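The two-stage idea in the abstract (compute the ridge solution, then select features) can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the top-k magnitude rule below is purely an assumption for illustration and is not the authors' method. The function names `fit_ridge_logistic` and `sparsify_top_k`, the plain gradient-descent solver, and the synthetic data are likewise hypothetical.

```python
import numpy as np

def fit_ridge_logistic(X, y, lam=1.0, lr=0.1, n_iter=500):
    """L2-penalised logistic regression fitted by plain gradient descent.

    y holds labels in {0, 1}. The solver and hyperparameters are
    illustrative assumptions, not taken from the paper.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted probabilities
        grad = X.T @ (p - y) / n + lam * w / n  # log-loss gradient plus ridge term
        w -= lr * grad
    return w

def sparsify_top_k(w, k):
    """Keep the k largest-magnitude weights and zero the rest.

    Assumed selection rule, used only to show the 'ridge first,
    select afterwards' structure.
    """
    w_sparse = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-k:]  # indices of the k largest |w_j|
    w_sparse[keep] = w[keep]
    return w_sparse

# Synthetic data: 200 observations, 50 features, 5 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[:5] = 2.0
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

w_ridge = fit_ridge_logistic(X, y)        # dense ridge solution
w_sparse = sparsify_top_k(w_ridge, k=5)   # sparse approximation of it
```

The dense vector `w_ridge` still yields probabilities through the logistic function, and `w_sparse` keeps that property while storing only k weights, which is the trade-off the abstract describes.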
Pages: 101-106
Page count: 6