Lung Cancer Classification and Gene Selection by Combining Affinity Propagation Clustering and Sparse Group Lasso

Cited by: 8
Authors
Li, Juntao [1]
Chang, Mingming [1]
Gao, Qinghui [1]
Song, Xuekun [2]
Gao, Zhiyu [2]
Affiliations
[1] Henan Normal Univ, Coll Math & Informat Sci, Xinxiang 453007, Henan, Peoples R China
[2] Henan Univ Chinese Med, Sch Informat Technol, Zhengzhou 450046, Peoples R China
Keywords
Lung cancer; gene selection; affinity propagation clustering; sparse group lasso; multi-classification; miRNA; MOLECULAR CLASSIFICATION; CLASS DISCOVERY; EXPRESSION; REGRESSION; REGULARIZATION; PREDICTION; MACHINE;
DOI
10.2174/1574893614666191017103557
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Cancer is a serious threat to human health. Diagnosing cancer via gene expression analysis is an active topic in cancer research. Objective: The study aimed to accurately diagnose the type of lung cancer and to discover the pathogenic genes. Methods: In this study, Affinity Propagation (AP) clustering with a similarity score was applied to each type of lung cancer and to normal lung. After grouping the genes, sparse group lasso was adopted to construct four binary classifiers, and a voting strategy was used to integrate them. Results: This study screened six gene groups, out of 73 gene groups, that may be associated with different lung cancer subtypes, and identified three possible key pathogenic genes: KRAS, BRAF and VDR. Furthermore, this study achieved improved classification accuracy on the minority classes SQ and COID in comparison with four other methods. Conclusion: We propose the AP clustering-based sparse group lasso (AP-SGL), which provides an alternative for simultaneous diagnosis and gene selection for lung cancer.
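For orientation, the sparse group lasso penalty named in the Methods can be written in its commonly used form. This is a sketch under assumptions, not the authors' exact formulation: the abstract does not state the loss function, so a logistic loss for each binary classifier is assumed here, and the symbols (n samples, expression vectors x_i, labels y_i in {0, 1}, G gene groups of sizes p_g taken from the AP clustering, tuning parameters lambda and alpha, and the group weights sqrt(p_g)) are introduced only for illustration.

```latex
% Assumed form: logistic loss plus the standard sparse group lasso penalty.
% \beta^{(g)} denotes the coefficients of the genes in group g.
\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}
    \Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
           - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\bigl\lVert \beta^{(g)} \bigr\rVert_2
  \;+\; \alpha\,\lambda\,\lVert \beta \rVert_1 ,
  \qquad 0 \le \alpha \le 1 .
```

In this form the group-wise l2 term can set an entire gene group to zero, while the l1 term can zero out individual genes inside a retained group, which is what allows gene groups and single genes to be selected at the same time.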
Pages: 703-712
Number of pages: 10