Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2+2 Regularization

Cited by: 58
Authors
Huang, Hai-Hui
Liu, Xiao-Ying
Liang, Yong [1]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Ave Wai Long, Taipa 999078, Macau, Peoples R China
Source
PLOS ONE | 2016 / Vol. 11 / Issue 05
Keywords
VARIABLE SELECTION; L-1/2; REGULARIZATION; LUNG; IDENTIFICATION; BIOMARKER; RECEPTOR; NETWORK
DOI
10.1371/journal.pone.0149675
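Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract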
Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not by itself perform feature selection. In this paper, we present a new hybrid L1/2+2 regularization (HLR) function, a linear combination of the L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits desirable characteristics from both the L1/2 penalty (sparsity) and the L2 penalty (the grouping effect, in which highly correlated variables enter or leave a model together). We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR-penalized logistic regression model. Empirical results and simulations indicate that the proposed method is highly competitive with several state-of-the-art methods.
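As a rough illustration of the penalty described in the abstract, the sketch below evaluates a linear combination of an L1/2 term and an L2 term and adds it to a mean logistic loss. The elastic-net-style weighting through `lam` and `alpha`, and the function names `hlr_penalty` and `penalized_logistic_loss`, are assumptions made for this example and are not taken from the record. The sketch only evaluates the objective; it does not implement the paper's thresholding rule or its coordinate descent algorithm.

```python
import math


def hlr_penalty(beta, lam, alpha):
    """Hybrid L1/2 + L2 penalty (assumed weighting, hypothetical name).

    Returns lam * (alpha * sum_j |beta_j|^(1/2) + (1 - alpha) * sum_j beta_j^2).
    alpha = 1 gives a pure L1/2 penalty, alpha = 0 a pure L2 (ridge) penalty.
    """
    l_half = sum(math.sqrt(abs(b)) for b in beta)  # sparsity-inducing term
    l_two = sum(b * b for b in beta)               # grouping-effect term
    return lam * (alpha * l_half + (1.0 - alpha) * l_two)


def penalized_logistic_loss(X, y, beta0, beta, lam, alpha):
    """Mean logistic negative log-likelihood plus the penalty above.

    X is a list of feature rows, y a list of 0/1 labels. The intercept
    beta0 is left unpenalized (an assumption of this sketch).
    """
    nll = 0.0
    for x_i, y_i in zip(X, y):
        z = beta0 + sum(b * v for b, v in zip(beta, x_i))
        # Numerically stable log(1 + exp(z)).
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        nll += softplus - y_i * z
    return nll / len(y) + hlr_penalty(beta, lam, alpha)


if __name__ == "__main__":
    # Toy coefficients: one large, one exactly zero, one small.
    print(hlr_penalty([4.0, 0.0, -1.0], lam=1.0, alpha=0.5))
```

Because the L1/2 term grows like a square root, small nonzero coefficients are penalized relatively heavily, while the squared term dominates for large coefficients; this is one way to read the abstract's claim that the hybrid combines sparsity with a grouping effect.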
Pages: 15