Regularized logistic regression without a penalty term: An application to cancer classification with microarray data

Times cited: 45
Authors
Bielza, Concha [1]
Robles, Victor [2]
Larranaga, Pedro [1]
Affiliations
[1] Tech Univ Madrid, Dept Artificial Intelligence, Madrid, Spain
[2] Tech Univ Madrid, Dept Comp Architecture & Technol, Madrid, Spain
Funding
US National Institutes of Health;
Keywords
Logistic regression; Regularization; Estimation of distribution algorithms; Cancer classification; Microarray data; PARTIAL LEAST-SQUARES; GENE SELECTION; DISEASE CLASSIFICATION; TUMOR CLASSIFICATION; REDUCTION; ALGORITHM; CLASSIFIERS; PREDICTION; WRAPPERS; LASSO;
DOI
10.1016/j.eswa.2010.09.140
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Regularized logistic regression is a useful classification method for problems with few samples and a huge number of variables. This regression needs to determine the regularization term, which amounts to searching for the optimal penalty parameter and the norm of the regression coefficient vector. This paper presents a new regularized logistic regression method based on the evolution of the regression coefficients using estimation of distribution algorithms. The main novelty is that it avoids the determination of the regularization term. The chosen simulation method of new coefficients at each step of the evolutionary process guarantees their shrinkage as an intrinsic regularization. Experimental results comparing the behavior of the proposed method with Lasso and ridge logistic regression in three cancer classification problems with microarray data are shown. (C) 2010 Elsevier Ltd. All rights reserved.
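As a rough illustration of what "evolving the regression coefficients with an estimation of distribution algorithm" can look like, the sketch below fits an unpenalized logistic regression with a generic univariate Gaussian EDA. It is not the authors' method: the abstract does not specify their simulation step, so the shrinkage-guaranteeing sampling is not reproduced here. The function names (`log_likelihood`, `umda_logistic`), the population settings, and the toy data are all assumptions made for illustration.

```python
import math
import random


def log_likelihood(beta, X, y):
    """Logistic log-likelihood; beta[0] is the intercept."""
    total = 0.0
    for row, label in zip(X, y):
        z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
        # log(1 + exp(z)) computed in a numerically stable way
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += label * z - softplus
    return total


def umda_logistic(X, y, pop_size=60, n_select=20, n_gen=40, seed=0):
    """Generic Gaussian EDA (assumed settings): sample, select, re-estimate."""
    rng = random.Random(seed)
    dim = len(X[0]) + 1
    mean = [0.0] * dim  # start the search centred on zero coefficients
    std = [1.0] * dim
    best = (log_likelihood(mean, X, y), mean[:])
    for _ in range(n_gen):
        # Sample a population of coefficient vectors from the current model
        pop = [[rng.gauss(m, s) for m, s in zip(mean, std)]
               for _ in range(pop_size)]
        scored = sorted(((log_likelihood(b, X, y), b) for b in pop),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best[0]:
            best = scored[0]
        # Re-estimate each coefficient's Gaussian from the selected individuals
        elite = [b for _, b in scored[:n_select]]
        for j in range(dim):
            col = [b[j] for b in elite]
            mean[j] = sum(col) / n_select
            var = sum((c - mean[j]) ** 2 for c in col) / n_select
            std[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse to a point
    return best[1], best[0]


# Toy, non-separable data with a single predictor (hypothetical values)
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 1, 0, 1, 1]
beta, ll = umda_logistic(X, y)
print(beta, ll)
```

No penalty term appears in the fitness function; in this sketch the only thing keeping coefficients moderate is that sampling starts from a zero-centred Gaussian, which is a much weaker device than the intrinsic regularization the abstract describes.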
Pages: 5110-5118
Number of pages: 9