Cancer Diagnosis and Disease Gene Identification via Statistical Machine Learning

Cited by: 23
Authors
Chen, Liuyuan [1 ,2 ]
Li, Juntao [1 ,2 ]
Chang, Mingming [1 ]
Affiliations
[1] Henan Normal Univ, Journal Editorial Off, Xinxiang 453007, Henan, Peoples R China
[2] Henan Normal Univ, Coll Math & Informat Sci, Henan Engn Lab Big Data Stat Anal & Optimal Contr, Xinxiang 453007, Henan, Peoples R China
Keywords
Cancer diagnosis; gene selection; machine learning; support vector machine; lasso; group lasso; SUPPORT VECTOR MACHINES; FEATURE-SELECTION; MICROARRAY CLASSIFICATION; VARIABLE SELECTION; MULTICLASS; PREDICTION; REGULARIZATION; REGRESSION;
DOI
10.2174/1574893615666200207094947
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Diagnosing cancer and identifying disease genes from DNA microarray gene expression data are hot topics in current bioinformatics. This paper reviews the latest developments in cancer diagnosis and gene selection via statistical machine learning. The support vector machine is first introduced for binary cancer diagnosis. Then the 1-norm support vector machine, doubly regularized support vector machine, adaptive huberized support vector machine, and other extensions are presented to improve gene selection performance. Lasso, elastic net, partly adaptive elastic net, group lasso, sparse group lasso, adaptive sparse group lasso, and other sparse regression methods are also introduced for simultaneous binary cancer classification and gene selection. For multiclass cancer diagnosis, three strategies for reducing multiclass problems to binary ones are introduced, along with methods that consider all classes directly in a single learning model (multi-class support vector machine, sparse multinomial regression, adaptive multinomial regression, and so on). Limitations and promising directions are also discussed.
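To make the idea of simultaneous classification and gene selection concrete, the following is a minimal sketch of one member of the sparse-regression family named in the abstract: lasso-penalized (L1) logistic regression, fitted by proximal gradient descent. It is not taken from the paper. The function names (soft_threshold, fit_l1_logistic), the synthetic data, and the settings lam=0.1 and step=0.5 are all illustrative assumptions. Gene 0 is constructed to track the class label and the remaining genes are pure noise, so the L1 penalty is expected to shrink most noise-gene weights to exactly zero.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=500):
    """Proximal gradient descent for mean logistic loss + lam * ||w||_1.

    X is a list of samples (one list of expression values per sample),
    y holds labels in {-1, +1}. The intercept is omitted for brevity.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Gradient of log(1 + exp(-margin)) with respect to w is coef * xi.
            coef = -yi / (1.0 + math.exp(margin))
            for j in range(p):
                grad[j] += coef * xi[j] / n
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


# Hypothetical synthetic data: 40 samples, 20 genes, balanced classes.
rng = random.Random(0)
n_samples, n_genes = 40, 20
y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
X = [[float(yi)] + [rng.gauss(0.0, 1.0) for _ in range(n_genes - 1)] for yi in y]

w = fit_l1_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected gene indices:", selected)
```

Genes whose weights remain nonzero form the selected subset, and the sign of the weighted sum gives the predicted class. Increasing lam yields a sparser gene set, and a sufficiently large value drives every weight to zero.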
Pages: 956-962
Page count: 7