Implementation and comparison of kernel-based learning methods to predict metabolic networks

Cited by: 5
Authors
Roche-Lima A. [1]
Affiliation
[1] Collaboration Center for Research in Health Disparities, Medical Science Campus, University of Puerto Rico, PO Box 365067, San Juan, 00936-5067, PR
Funding
National Institutes of Health (USA); Natural Sciences and Engineering Research Council of Canada
Keywords
Kernel methods; Machine learning; Metabolic pathways; Network prediction;
DOI
10.1007/s13721-016-0134-5
Abstract
Metabolic pathways can be conceptualized as the biological equivalent of a data pipeline. In living cells, a series of chemical reactions is carried out stepwise by different proteins called enzymes. However, many pathways remain incompletely characterized, and in some of them not all enzyme components have been identified. Kernel methods are useful in many difficult problem areas, such as document classification and bioinformatics. In particular, they have recently been used to predict biological networks, such as protein–protein interaction networks and metabolic networks. In this paper, we implement and compare different methods and types of data for predicting metabolic networks. The methods are Penalized Kernel Matrix Regression (PKMR) and pairwise Support Vector Machine (pSVM). We run several experiments using these methods with sequence, non-sequence, and combined data. Both methods achieve better accuracy when sequence data are used. When the methods are compared using the same type of data, the pSVM approach shows better accuracy. The best results are obtained with pSVM using all combined kernels. © 2016, The Author(s).
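The abstract does not say which pairwise kernel or kernel-combination rule the paper uses. As a minimal sketch of the general idea only, the Python below assumes two common choices: base kernels over enzymes are combined by entrywise summation, and enzyme pairs are compared with the symmetric tensor-product pairwise kernel K((a,b),(c,d)) = K(a,c)K(b,d) + K(a,d)K(b,c). The function names and the toy matrices `K_seq` and `K_other` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Pair = Tuple[int, int]


def sum_kernels(kernels: Sequence[Matrix]) -> Matrix:
    """Combine base kernels over the same n items by entrywise summation.

    A sum of positive semidefinite kernels is again a valid kernel.
    """
    n = len(kernels[0])
    return [[sum(K[i][j] for K in kernels) for j in range(n)] for i in range(n)]


def pairwise_kernel(K: Matrix, p: Pair, q: Pair) -> float:
    """Symmetric tensor-product pairwise kernel between item pairs p and q.

    Assumed form: K(a,c)*K(b,d) + K(a,d)*K(b,c), which does not depend on
    the order of the items inside each pair.
    """
    a, b = p
    c, d = q
    return K[a][c] * K[b][d] + K[a][d] * K[b][c]


def pairwise_gram(K: Matrix, pairs: Sequence[Pair]) -> Matrix:
    """Gram matrix over pairs, suitable for an SVM with a precomputed kernel."""
    return [[pairwise_kernel(K, p, q) for q in pairs] for p in pairs]


# Toy base kernels over three hypothetical enzymes (illustrative values only).
K_seq = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
K_other = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.4], [0.0, 0.4, 1.0]]

K_combined = sum_kernels([K_seq, K_other])
pairs = [(0, 1), (0, 2), (1, 2)]
G = pairwise_gram(K_combined, pairs)
```

A Gram matrix such as `G` could then be passed to any SVM implementation that accepts precomputed kernels (for example scikit-learn's `SVC(kernel="precomputed")`), with each pair labeled by whether the two enzymes are known to be linked in the network.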