Integrating Somatic Mutations for Breast Cancer Survival Prediction Using Machine Learning Methods

Cited by: 12
Authors
He, Zongzhen [1]
Zhang, Junying [1]
Yuan, Xiguo [1]
Zhang, Yuanyuan [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Qingdao Univ Technol, Sch Informat & Control Engn, Qingdao, Peoples R China
Keywords
breast cancer; multi-omics; survival prediction; somatic mutation; mRMR; MKL; EXPRESSION; PROGNOSIS;
DOI
10.3389/fgene.2020.632901
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
Breast cancer is the most common malignancy in women and has a high mortality rate, so computational methods that improve the accuracy of breast cancer survival prediction models are urgently needed. Although multi-omics data such as gene expression have been used extensively in recent studies, accurate prognosis of breast cancer remains a challenge. Somatic mutations are another important and promising data source for studying cancer development, and their effect on the prognosis of breast cancer remains to be further explored. Meanwhile, these omics datasets are high-dimensional and redundant. Therefore, we adopted multiple kernel learning (MKL) to efficiently integrate somatic mutations with existing molecular data, including gene expression, copy number variation (CNV), methylation, and protein expression data, for the prediction of breast cancer survival. Before integration, the maximum relevance minimum redundancy (mRMR) feature selection method was used on each data type to select features that are highly relevant to survival and have low redundancy among themselves. The experimental results demonstrated that the proposed method achieved the best performance and that prediction performance improved markedly when somatic mutations were included, indicating that somatic mutations are critical for improving breast cancer survival predictions. Moreover, mRMR was superior to other feature selection methods used in previous studies, and MKL outperformed traditional classifiers in multi-omics data integration. Our analysis indicated that by employing promising omics data such as somatic mutations, together with proper feature selection methods and effective integration frameworks, the accuracy of breast cancer survival prediction can be further increased, thereby supporting better clinical diagnosis and more effective treatment for breast cancer patients.
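The mRMR step described in the abstract can be illustrated with a minimal sketch. It assumes binary somatic mutation indicators (1 = gene mutated in a sample), a binary survival label, and the "difference" form of the criterion (relevance minus mean redundancy, both measured by mutual information). The abstract does not state which mRMR variant the authors used, and the gene names and toy values below are hypothetical, so this is an illustration of the general technique rather than the paper's implementation.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    # Sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * math.log(c * n / (pa[x] * pb[y]))
        for (x, y), c in joint.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR, difference form (an assumed variant).

    features: dict mapping feature name -> list of discrete values per sample.
    Returns up to k feature names in selection order.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    # Start with the single most relevant feature.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(k, len(features)):

        def score(f):
            # Mean mutual information with the features already chosen.
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        candidates = [f for f in features if f not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Hypothetical toy data: 8 samples, label 1 = short-term survivor.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "gene_a": [1, 1, 1, 0, 0, 0, 0, 0],  # informative
    "gene_b": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of gene_a
    "gene_c": [0, 0, 1, 1, 0, 0, 1, 0],  # weakly informative
    "gene_d": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label
}
selected = mrmr(features, labels, k=2)
print(selected)
```

In this toy example gene_b is as relevant as gene_a but fully redundant with it, so the second pick goes to a different feature. That is the behavior the abstract refers to as low redundancy among the selected features.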
Pages: 12