Graph Theory-Based Sequence Descriptors as Remote Homology Predictors

Cited by: 11
Authors
Agueero-Chapin, Guillermin [1, 2]
Galpert, Deborah [3 ]
Molina-Ruiz, Reinaldo [4 ]
Ancede-Gallardo, Evys [5 ]
Perez-Machado, Gisselle [6 ]
De la Riva, Gustavo A. [7, 8]
Antunes, Agostinho [1, 2]
Affiliations
[1] Univ Porto, Interdisciplinary Ctr Marine & Environm Res, CIIMAR CIMAR, Terminal Cruzeiros Porto Leixoes, Av Gen Norton Matos S-N, P-4450208 Porto, Portugal
[2] Univ Porto, Dept Biol, Fac Sci, Rua Campo Alegre, P-4169007 Porto, Portugal
[3] Univ Cent Marta Abreu Las Villas UCLV, Dept Ciencia Comp, Santa Clara 54830, Cuba
[4] Univ Cent Marta Abreu Las Villas UCLV, Ctr Bioact Quim CBQ, Santa Clara 54830, Cuba
[5] Univ Andres Bello, Fac Ciencias Exactas, Programa Doctorado Fis Quim Mol, Av Republ 239, Santiago 8370146, Chile
[6] EpiDisease SL Spin Off, Ctr Invest Biomed Red Enfermedades Raras CIBERER, Valencia 46980, Spain
[7] GRECA Inc, Lab Biotecnol Aplicada S RL CV, Carretera Piedad Carapan,Km 3-5, La Piedad 59300, Michoacan, Mexico
[8] Tecnol Nacl Mexico, Inst Tecnol Piedad, Av Ricardo Guzman Romero, La Piedad De Cavadas 59370, Michoacan, Mexico
Keywords
QSAR; topological indices; alignment-free; bioinformatics; big data; AMINO-ACID-COMPOSITION; COUPLED RECEPTOR CLASSES; GENETIC NEURAL-NETWORKS; ATOM ADJACENCY MATRIX; FLEXIBLE WEB SERVER; IN-SILICO DESIGN; TOPOLOGICAL INDEXES; PROTEIN STABILITY; DNA-SEQUENCES; CONFORMATIONAL STABILITY;
DOI
10.3390/biom10010026
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Alignment-free (AF) methodologies have grown in popularity in recent decades as alternatives to alignment-based (AB) algorithms for comparative sequence analysis. They have been especially useful for detecting remote homologs within the twilight zone of highly diverse gene/protein families and superfamilies. The most popular alignment-free methodologies, as well as their applications to classification problems, have been described in previous reviews. Although a new set of graph theory-derived sequence/structural descriptors has been gaining relevance in remote homology detection, these descriptors have been omitted as AF predictors when the topic is addressed. Here, we first go over the most popular AF approaches used for detecting homology signals within the twilight zone, and then present the state-of-the-art tools that encode graph theory-derived sequence/structure descriptors and their success in identifying remote homologs. We also highlight the trend of integrating AF features/measures with AB ones, either within the same prediction model or by combining the predictions of different algorithms through voting/weighting strategies, to improve the detection of remote signals. Lastly, we briefly discuss the efforts made to scale up AB and AF features/measures for the comparison of multiple genomes and proteomes. Alongside the experience gained in remote homology detection with both the most popular AF tools and less well-known ones, we report our own experience with the graphical-numerical methodologies MARCH-INSIDE, TI2BioP, and ProtDCal. We also present a new Python-based tool (SeqDivA) with a friendly graphical user interface (GUI) for delimiting the twilight zone by using several similar criteria.
Pages: 32