A sparse version of the ridge logistic regression for large-scale text categorization

Cited by: 28
|
Authors
Aseervatham, Sujeevan [1 ]
Antoniadis, Anestis [2 ]
Gaussier, Eric [1 ]
Burlet, Michel [3 ]
Denneulin, Yves [4 ]
Affiliations
[1] Univ Grenoble 1, LIG, F-38041 Grenoble 9, France
[2] Univ Grenoble 1, LJK, F-38041 Grenoble 9, France
[3] Univ Grenoble 1, Lab Leibniz, F-38031 Grenoble 1, France
[4] ENSIMAG, LIG, F-38330 Montbonnot St Martin, France
Keywords
Logistic regression; Model selection; Text categorization; Large-scale categorization; REGULARIZATION; SELECTION; MODEL;
DOI
10.1016/j.patrec.2010.09.023
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ridge logistic regression has been used successfully in text categorization, where it has been shown to match the performance of the Support Vector Machine while offering the key advantage of producing a probability rather than a score. However, the dense solution of the ridge makes it impractical for large-scale categorization. On the other hand, LASSO regularization produces sparse solutions, but its performance is dominated by the ridge when the number of features exceeds the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method that approximates the ridge solution with a sparse one. The method first computes the ridge solution and then performs feature selection. Experimental evaluations show that our method yields a solution that is a good trade-off between the ridge and LASSO solutions. (C) 2010 Elsevier B.V. All rights reserved.
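The two-stage idea described in the abstract (fit a dense ridge solution, then select features to obtain a sparse approximation) can be illustrated with a minimal sketch. This is not the authors' exact selection procedure; it assumes a simple magnitude-based truncation of the ridge coefficients followed by refitting, with scikit-learn standing in for the paper's own solver and `k` as a hypothetical sparsity parameter:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a text-categorization design matrix:
# many features, fewer observations.
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=20, random_state=0)

# Stage 1: dense L2-regularized (ridge) logistic regression.
ridge = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)

# Stage 2: feature selection approximating the ridge solution.
# Here: keep the k features with the largest |coefficient|
# (k would be tuned by cross-validation in practice).
k = 50
top = np.argsort(np.abs(ridge.coef_[0]))[-k:]

# Refit on the selected features to obtain the sparse model.
sparse_model = LogisticRegression(penalty="l2", C=1.0,
                                  max_iter=1000).fit(X[:, top], y)
print(sparse_model.coef_.shape)  # (1, 50): only k features retained
```

The refit step matters: simply zeroing small ridge coefficients biases the remaining ones, whereas refitting on the selected subset re-estimates them, which is what lets the sparse model stay close to the dense ridge performance.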
Pages: 101-106
Page count: 6
Related papers
50 records in total
  • [41] Design and implementation of a large-scale multi-class text classifier
    于水
    张亮
    马范援
    Journal of Harbin Institute of Technology, 2005, (06) : 690 - 695
  • [42] A Piecewise Linear Regression Model Ensemble for Large-Scale Curve Fitting
    Moreno-Carbonell, Santiago
    Sanchez-Ubeda, Eugenio F.
    ALGORITHMS, 2024, 17 (04)
  • [43] Sparse Large-Scale Nonlinear Dynamical Modeling of Human Hippocampus for Memory Prostheses
    Song, Dong
    Robinson, Brian S.
    Hampson, Robert E.
    Marmarelis, Vasilis Z.
    Deadwyler, Sam A.
    Berger, Theodore W.
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2018, 26 (02) : 272 - 280
  • [44] Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models
    Fattahi, Salar
    Zhang, Richard Y.
    Sojoudi, Somayeh
    IEEE ACCESS, 2019, 7 : 12658 - 12672
  • [45] Two-Stage Nonnegative Sparse Representation for Large-Scale Face Recognition
    He, Ran
    Zheng, Wei-Shi
    Hu, Bao-Gang
    Kong, Xiang-Wei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (01) : 35 - 46
  • [46] Regularized and Sparse Stochastic K-Means for Distributed Large-Scale Clustering
    Jumutc, Vilen
    Langone, Rocco
    Suykens, Johan A. K.
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 2535 - 2540
  • [47] Sparse generalized principal component analysis for large-scale applications beyond Gaussianity
    Zhang, Qiaoya
    She, Yiyuan
    STATISTICS AND ITS INTERFACE, 2016, 9 (04) : 521 - 533
  • [48] Accounting for large-scale factors in the study of understory vegetation using a conditional logistic model
    Kuhlmann-Berenzon, Sharon
    Hjorth, Urban
    ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2007, 14 (02) : 149 - 159
  • [49] Benefits of sparse population sampling in multi-objective evolutionary computing for large-Scale sparse optimization problems
    Kropp, Ian
    Nejadhashemi, A. Pouyan
    Deb, Kalyanmoy
    SWARM AND EVOLUTIONARY COMPUTATION, 2022, 69