Comparative evaluation of network features for the prediction of breast cancer metastasis

Cited by: 6
Authors
Adnan, Nahim [1 ]
Liu, Zhijie [2 ]
Huang, Tim H. M. [2 ]
Ruan, Jianhua [1 ,2 ]
Affiliations
[1] Univ Texas San Antonio, Dept Comp Sci, One UTSA Circle, San Antonio, TX 78249 USA
[2] Univ Texas Hlth Sci Ctr San Antonio, Dept Mol Med, 7703 Floyd Curl Dr, San Antonio, TX 78230 USA
Keywords
Breast cancer metastasis; Metastasis prediction; Network features; Gene expression analysis;
DOI
10.1186/s12920-020-0676-3
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Discovering a highly accurate and robust gene signature for predicting breast cancer metastasis from gene expression profiles of primary tumors is a major challenge in reducing breast cancer mortality in women. Because gene-based features have had limited success in achieving satisfactory prediction accuracy, many methods have been proposed in recent years to build network-based features by integrating network information with gene expression. However, evaluation results have been too inconsistent to confirm the effectiveness of network-based features, because many confounding factors are involved in learning a classification model, such as data normalization, dimension reduction, and feature selection. An unbiased comparative evaluation is essential for uncovering the strength of network-based features.
Methods: In this study, we compared several types of network-based features for their ability to predict breast cancer metastasis, using gene expression data from more than 10 patient cohorts. The features were obtained by applying different mathematical operators (Mean, Maximum, Minimum, Median, Variance) to a geneset (i.e., a gene and its neighbors in the network) in a protein-protein interaction network and in a gene co-expression network.
Results: While network-based features are usually statistically more significant than gene-based features, a consistent improvement in prediction performance using network-based features requires a substantial number of patients in the dataset. Contrary to many previous reports, no evidence was found to support the robustness of network-based features, and we argue that some of the reported robustness may be due to the inherent bias associated with node degree in the network. In addition, different types of network features seem to cover different pathways and are complementary to each other. Consequently, an ensemble classifier combining different network features was proposed and was found to significantly outperform classifiers based on gene-based features or on any single type of network-based feature.
Conclusions: Network-based features and their combination show promise for improving the prediction of breast cancer metastasis but may require a large amount of training data. The robustness claims made for network-based features need to be re-examined with network node degree and other confounding factors taken into consideration.
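The feature construction described in the Methods part of the abstract (an operator applied to a gene and its network neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `network_features`, the data layout (genes as rows, samples as columns), the toy network, and the use of population variance are all assumptions made for illustration.

```python
import numpy as np

# The five operators named in the abstract. np.var computes the population
# variance (ddof=0) by default; whether the paper uses that or the sample
# variance is not stated, so this is an assumption.
OPERATORS = {
    "mean": np.mean,
    "max": np.max,
    "min": np.min,
    "median": np.median,
    "variance": np.var,
}


def network_features(expr, genes, neighbors, op="mean"):
    """Aggregate expression over each gene's geneset (the gene plus its neighbors).

    expr      : array of shape (n_genes, n_samples), rows ordered as `genes`
    genes     : list of gene identifiers
    neighbors : dict mapping a gene to an iterable of neighboring genes
    op        : key into OPERATORS
    Returns an array of shape (n_genes, n_samples).
    """
    index = {g: i for i, g in enumerate(genes)}
    aggregate = OPERATORS[op]
    out = np.empty(expr.shape, dtype=float)
    for gene, i in index.items():
        # Geneset = the gene itself plus every neighbor present in the data.
        members = [i] + [
            index[n] for n in neighbors.get(gene, ()) if n in index and n != gene
        ]
        out[i] = aggregate(expr[members], axis=0)
    return out


# Toy data (hypothetical): 3 genes, 2 samples.
genes = ["A", "B", "C"]
expr = np.array([[1.0, 2.0],
                 [3.0, 6.0],
                 [5.0, 1.0]])
neighbors = {"A": ["B", "C"], "B": ["A"], "C": []}

mean_feat = network_features(expr, genes, neighbors, "mean")
max_feat = network_features(expr, genes, neighbors, "max")
var_feat = network_features(expr, genes, neighbors, "variance")
print(mean_feat)  # rows: A -> [3, 3], B -> [2, 4], C -> [5, 1]
```

A gene with no neighbors keeps its own expression under Mean, Maximum, Minimum, and Median, and gets zero under Variance. High-degree genes aggregate over many more values, which is one way the node-degree bias mentioned in the Results could arise.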
Pages: 10
Related papers
25 in total
[1]   Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context [J].
Abraham, Gad ;
Kowalczyk, Adam ;
Loi, Sherene ;
Haviv, Izhak ;
Zobel, Justin .
BMC BIOINFORMATICS, 2010, 11
[2]   De novo pathway-based biomarker identification [J].
Alcaraz, Nicolas ;
List, Markus ;
Batra, Richa ;
Vandin, Fabio ;
Ditzel, Henrik J. ;
Baumbach, Jan .
NUCLEIC ACIDS RESEARCH, 2017, 45 (16)
[3]   FERAL: network-based classifier with application to breast cancer outcome prediction [J].
Allahyar, Amin ;
de Ridder, Jeroen .
BIOINFORMATICS, 2015, 31 (12) :311-319
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]   Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees [J].
Chou, Hsiu-Ling ;
Yao, Chung-Tay ;
Su, Sui-Lun ;
Lee, Chia-Yi ;
Hu, Kuang-Yu ;
Terng, Harn-Jing ;
Shih, Yun-Wen ;
Chang, Yu-Tien ;
Lu, Yu-Fen ;
Chang, Chi-Wen ;
Wahlqvist, Mark L. ;
Wetter, Thomas ;
Chu, Chi-Ming .
BMC BIOINFORMATICS, 2013, 14
[6]   Network-based classification of breast cancer metastasis [J].
Chuang, Han-Yu ;
Lee, Eunjung ;
Liu, Yu-Tsueng ;
Lee, Doheon ;
Ideker, Trey .
MOLECULAR SYSTEMS BIOLOGY, 2007, 3 (1)
[7]   Inferring cancer subnetwork markers using density-constrained biclustering [J].
Dao, Phuong ;
Colak, Recep ;
Salari, Raheleh ;
Moser, Flavia ;
Davicioni, Elai ;
Schoenhuth, Alexander ;
Ester, Martin .
BIOINFORMATICS, 2010, 26 (18) :i625-i631
[8]   Outcome signature genes in breast cancer: is there a unique set? [J].
Ein-Dor, L ;
Kela, I ;
Getz, G ;
Givol, D ;
Domany, E .
BIOINFORMATICS, 2005, 21 (02) :171-178
[9]   A Steiner tree-based method for biomarker discovery and classification in breast cancer metastasis [J].
Jahid, Md Jamiul ;
Ruan, Jianhua .
BMC GENOMICS, 2012, 13
[10]   Inferring Pathway Activity toward Precise Disease Classification [J].
Lee, Eunjung ;
Chuang, Han-Yu ;
Kim, Jong-Won ;
Ideker, Trey ;
Lee, Doheon .
PLOS COMPUTATIONAL BIOLOGY, 2008, 4 (11)