Exploring the Adenylation Domain Repertoire of Nonribosomal Peptide Synthetases Using an Ensemble of Sequence-Search Methods

Cited by: 7
Authors
Agüero-Chapin, Guillermin [1, 2, 3]
Molina-Ruiz, Reinaldo [2 ]
Maldonado, Emanuel [1 ]
de la Riva, Gustavo [4 ]
Sanchez-Rodriguez, Aminael [5 ]
Vasconcelos, Vitor [1, 3]
Antunes, Agostinho [1, 3]
Affiliations
[1] Univ Porto, CIMAR CIIMAR, Ctr Interdisciplinar Invest Marinha & Ambiental, P-4100 Porto, Portugal
[2] Univ Cent Marta Abreu Las Villas UCLV, Mol Simulat & Drug Design CBQ, Santa Clara, Cuba
[3] Univ Porto, Dept Biol, Fac Ciencias, P-4100 Porto, Portugal
[4] Inst Tecnol Super Irapuato ITESI, Dept Biol, Guanajuato, Mexico
[5] Katholieke Univ Leuven, Dept Microbial & Mol Syst, CMPG, Louvain, Belgium
Keywords
AMINO-ACID-COMPOSITION; NEURAL-NETWORK MODEL; IN-SILICO; TOPOLOGICAL INDEXES; GRAPHICAL REPRESENTATION; ADJACENCY MATRIX; EFFICIENT SEARCH; PREDICTION; ALIGNMENT; DNA;
DOI
10.1371/journal.pone.0065926
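Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;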
Abstract
The use of two-dimensional (2D) graphs and their numerical characterization for comparing DNA/RNA and protein sequences without the need for sequence alignments is a recent and active research topic in bioinformatics. Here, we used a 2D artificial representation (four-color maps) with a simple numerical characterization through topological indices (TIs) to aid the discovery of remote homologs of adenylation domains (A-domains) from the nonribosomal peptide synthetase (NRPS) class in the proteome of the cyanobacterium Microcystis aeruginosa. Cyanobacteria are a rich source of structurally diverse oligopeptides that are predominantly synthesized by NRPS. Several A-domains share amino acid identities lower than 20%, which makes them a possible source of remote homologs. Therefore, A-domains cannot be easily retrieved by BLASTp searches using a single template. To cope with the sequence diversity of the A-domains, we combined homology-search methods with an alignment-free tool that uses protein four-color maps. TI2BioP (Topological Indices to BioPolymers) version 2.0, available at http://ti2biop.sourceforge.net/, allowed the calculation of simple TIs from the protein sequences (four-color maps). These TIs were used as input predictors for the statistical estimations required to build the alignment-free models. We concluded that the use of graphical/numerical approaches in cooperation with other sequence-search methods, such as multi-template BLASTp and profile HMM, can give the most complete exploration of the repertoire of highly diverse protein families.
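The ensemble idea in the abstract, pooling candidates found by several search methods, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `combine_hits`, the method labels, and the protein IDs are all hypothetical, and the sketch assumes each method has already produced a set of candidate IDs.

```python
from collections import defaultdict


def combine_hits(hits_by_method):
    """Map each protein ID to the sorted list of methods that reported it."""
    support = defaultdict(set)
    for method, hits in hits_by_method.items():
        for protein_id in hits:
            support[protein_id].add(method)
    return {pid: sorted(methods) for pid, methods in support.items()}


# Hypothetical hit lists, invented for illustration only.
hits_by_method = {
    "multi_template_blastp": {"prot_A", "prot_B"},
    "profile_hmm": {"prot_A", "prot_B", "prot_C"},
    "alignment_free_model": {"prot_A", "prot_D"},
}

combined = combine_hits(hits_by_method)
for pid in sorted(combined):
    print(pid, combined[pid])
```

Taking the union keeps candidates that only one method detects, which is the case of interest for remote homologs, while the per-candidate method list shows how much agreement supports each hit.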
Pages: 13