Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition

Cited by: 86
Authors
Chen, Lei [2, 3]
Feng, Kai-Yan [4]
Cai, Yu-Dong [1, 5]
Chou, Kuo-Chen [5]
Li, Hai-Peng [6]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai 200444, Peoples R China
[2] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[3] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
[4] Univ Manchester, Sch Med, Div Imaging Sci, Manchester M13 9PT, Lancs, England
[5] Gordon Life Sci Inst, San Diego, CA 92130 USA
[6] Chinese Acad Sci, Shanghai Inst Biol Sci, CAS MPG Partner Inst Computat Biol, Shanghai 200031, Peoples R China
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
SUPPORT VECTOR MACHINES; PROTEIN SUBCELLULAR LOCATION; AMINO-ACID-COMPOSITION; GRAPHICAL RULES; QUATERNARY STRUCTURE; SECONDARY STRUCTURE; WEB SERVER; STEADY; CLASSIFICATION; SYSTEM
DOI
10.1186/1471-2105-11-293
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: A metabolic pathway is a highly regulated network of metabolic reactions involving substrates, enzymes, and products, in which substrates are transformed into products by particular catalytic enzymes. Since experimental determination of the network of substrate-enzyme-product triads (i.e., whether a substrate can be transformed into a product by a given enzyme) is both time-consuming and expensive, it would be very useful to develop a computational approach for predicting this network.
Results: A mathematical model for predicting the network of substrate-enzyme-product triads was developed. A benchmark dataset was constructed containing 744,192 substrate-enzyme-product triads, of which 14,592 are networking (positive) triads and 729,600 are non-networking (negative) triads; i.e., the number of negative triads was 50 times the number of positive triads. The molecular graph was used to calculate the similarity between substrate compounds and between product compounds, while the functional domain composition was used to calculate the similarity between enzyme molecules. The nearest neighbour algorithm was used as the prediction engine, with a novel metric introduced to measure the "nearness" between triads. To train and test the prediction engine, one tenth of the positive triads and one tenth of the negative triads were randomly picked from the benchmark dataset as testing samples, while the remainder were used to train the prediction model. The overall success rate in predicting the network for the testing samples was 98.71%, with a 95.41% success rate for the 1,460 testing networking triads and 98.77% for the 72,960 testing non-networking triads.
Conclusions: It is quite promising and encouraging to use the molecular graph to calculate the similarity between compounds and the functional domain composition to calculate the similarity between enzymes when studying the substrate-enzyme-product network system. The software is available upon request.
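The nearest neighbour prediction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' method: compounds and enzymes are represented as sets of made-up feature labels, the Tanimoto coefficient stands in for both the molecular-graph similarity and the functional-domain similarity, and the triad "nearness" is taken to be the product of the three component similarities. The record does not give the paper's actual metric.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Triad:
    # All three fields hold hypothetical feature labels (placeholders only).
    substrate: FrozenSet[str]  # structural features of the substrate compound
    enzyme: FrozenSet[str]     # functional-domain identifiers of the enzyme
    product: FrozenSet[str]    # structural features of the product compound


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def nearness(t1: Triad, t2: Triad) -> float:
    """Assumed triad metric: product of the three component similarities."""
    return (
        tanimoto(t1.substrate, t2.substrate)
        * tanimoto(t1.enzyme, t2.enzyme)
        * tanimoto(t1.product, t2.product)
    )


def predict(query: Triad, training: List[Tuple[Triad, bool]]) -> bool:
    """Return the label (networking or not) of the nearest training triad."""
    _, label = max(training, key=lambda item: nearness(query, item[0]))
    return label


# Toy training set with invented feature labels.
train = [
    (Triad(frozenset({"C6", "OH", "CHO"}),
           frozenset({"DOM_A", "DOM_B"}),
           frozenset({"C6", "OH", "PO4"})), True),
    (Triad(frozenset({"C2", "NH2"}),
           frozenset({"DOM_C"}),
           frozenset({"C3", "COOH"})), False),
]
query = Triad(frozenset({"C6", "OH"}),
              frozenset({"DOM_A"}),
              frozenset({"C6", "PO4"}))

print(predict(query, train))
```

Taking the product means a triad is only "near" another when the substrate, the enzyme, and the product all resemble their counterparts; an average would be a more lenient alternative under the same assumptions.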
Pages: 11