A nearest neighbor approach for automated transporter prediction and categorization from protein sequences

Cited by: 24
Authors
Li, Haiquan [1]
Dai, Xinbin [1]
Zhao, Xuechun [1]
Affiliations
[1] Samuel Roberts Noble Fdn Inc, Div Plant Biol, Bioinformat Lab, Ardmore, OK 73401 USA
DOI
10.1093/bioinformatics/btn099
CLC number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Membrane transport proteins play a crucial role in the import and export of ions, small molecules or macromolecules across biological membranes. Currently, few published computational tools enable the systematic discovery and categorization of transporters prior to costly experimental validation. To approach this problem, we utilized a nearest neighbor method that integrates homology search and topological analysis into a machine-learning framework.
Results: Our approach satisfactorily distinguished 484 transporter families in the Transporter Classification Database, a curated and representative database for transporters. A five-fold cross-validation on the database achieved a positive classification rate of 72.3% on average. Furthermore, this method successfully detected transporters in seven model and four non-model organisms, ranging from archaeal to mammalian species. A preliminary literature-based validation has cross-validated 65.8% of our predictions on the 11 organisms, including 55.9% of our predictions overlapping with 83.6% of the predicted transporters in TransportDB.
Availability and Supplementary information: http://bioinfo.noble.org/manuscript-support/transporter/
Contact: pzhao@noble.org
Pages: 1129-1136
Page count: 8