Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

Cited by: 23
Authors
Borrayo, Ernesto [1]
Gerardo Mendizabal-Ruiz, E. [1]
Velez-Perez, Hugo [1]
Romo-Vazquez, Rebeca [1]
Mendizabal, Adriana P. [2]
Alejandro Morales, J. [1, 3]
Affiliations
[1] Univ Guadalajara, CUCEI, Dept Comp Sci, Guadalajara 44430, Jalisco, Mexico
[2] Univ Guadalajara, CUCEI, Mol Biol Lab, Farmacobiol Dept, Guadalajara 44430, Jalisco, Mexico
[3] Univ Guadalajara, Ctr Theoret Res & High Performance Comp, CUCEI, Guadalajara 44430, Jalisco, Mexico
Source
PLOS ONE | 2014 / Vol. 9 / Issue 11
Keywords
SIMILARITY; IDENTIFICATION; PREDICTION; SEARCH
DOI
10.1371/journal.pone.0110954
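Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract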
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
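The doublet-based mapping mentioned in the abstract can be illustrated with a minimal sketch. The amplitude table below is hypothetical: it simply assigns each of the 16 possible doublets a distinct integer, and it reads doublets with an overlapping window. Neither choice is taken from the paper, and the names DOUBLET_AMPLITUDE and doublet_signal are illustrative only. The sketch shows only why a doublet mapping yields up to 16 amplitude levels, compared with 4 for a single-nucleotide mapping.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical amplitude table (an assumption, not the GAFD values):
# each of the 16 doublets gets a distinct integer from 0 to 15.
DOUBLET_AMPLITUDE = {
    a + b: i for i, (a, b) in enumerate(product(BASES, repeat=2))
}


def doublet_signal(seq):
    """Map a DNA string to one amplitude per overlapping doublet.

    The overlapping window is an assumption made for this sketch.
    Any character outside A, C, G, T raises KeyError.
    """
    seq = seq.upper()
    return [DOUBLET_AMPLITUDE[seq[i:i + 2]] for i in range(len(seq) - 1)]


signal = doublet_signal("ACGTTG")
print(signal)                  # one value per doublet: AC, CG, GT, TT, TG
print(len(DOUBLET_AMPLITUDE))  # number of possible amplitude levels
```

A sequence of length n produces a signal of length n - 1 under this windowing; a non-overlapping window would instead give roughly n / 2 samples.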
Pages: 13