The analysis of genomics data needs to become as automated as its generation. Here we present a novel data-mining approach to predicting protein functional class from sequence. This method is based on a combination of inductive logic programming clustering and rule learning. We demonstrate the effectiveness of this approach on the M. tuberculosis and E. coli genomes, and identify biologically interpretable rules which predict protein functional class from information only available from the sequence. These rules predict 65% of the ORFs with no assigned function in M. tuberculosis and 24% of those in E. coli, with an estimated accuracy of 60-80% (depending on the level of functional assignment). The rules are founded on a combination of detection of remote homology, convergent evolution and horizontal gene transfer. We identify rules that predict protein functional class even in the absence of detectable sequence or structural homology. These rules give insight into the evolutionary history of M. tuberculosis and E. coli. Copyright (C) 2000 John Wiley & Sons, Ltd.