Similarity/dissimilarity calculation methods of DNA sequences: A survey

Cited by: 22
Authors
Jin, Xin [1]
Jiang, Qian [1]
Chen, Yanyan [2]
Lee, Shin-Jye [3, 4]
Nie, Rencan [1]
Yao, Shaowen [3]
Zhou, Dongming [1]
He, Kangjian [1]
Affiliations
[1] Yunnan Univ, Sch Informat, Kunming, Yunnan, Peoples R China
[2] Yunnan Univ, Sch Life Sci, Kunming, Yunnan, Peoples R China
[3] Yunnan Univ, Sch Software, Kunming, Yunnan, Peoples R China
[4] Univ Cambridge, Queens Coll, Cambridge CB3 9ET, England
Funding
National Natural Science Foundation of China;
Keywords
DNA sequence analysis; Similarity analysis; Graphical representation; Evolutionary relationship; Feature extraction; 2D GRAPHICAL REPRESENTATION; CHAOS GAME REPRESENTATION; SIMILARITY ANALYSIS; CURVE; MODEL; SIMILARITIES/DISSIMILARITIES;
DOI
10.1016/j.jmgm.2017.07.019
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
DNA sequence similarity/dissimilarity analysis is a fundamental task in computational biology: it compares DNA sequences in order to infer their evolutionary relationships. Over the past decades, a large number of similarity analysis methods for DNA sequences have been proposed to meet ever-growing demand. To take stock of these advances and promote the development of the field, we present a survey. In this paper, we first introduce the background of DNA similarity analysis, including the data sets, similarity distances, and output data. We then review recent algorithmic developments for DNA similarity analysis to present the state of the art in this field. Finally, we summarize the corresponding trends and challenges in this research field. The survey concludes that although various DNA similarity analysis methods have been proposed, there is still room for further improvement, and several potential research directions remain open. (C) 2017 Elsevier Inc. All rights reserved.
Pages: 342-355
Page count: 14
Related papers
59 in total
[1]   DNA sequencing using optical joint Fourier transform [J].
Alqallaf, A. K. ;
Cherri, A. K. .
OPTIK, 2016, 127 (04) :1929-1936
[2]   A representation of DNA primary sequences by random walk [J].
Bai, Feng-lan ;
Liu, Ying-zhao ;
Wang, Tian-ming .
MATHEMATICAL BIOSCIENCES, 2007, 209 (01) :282-291
[3]   Similarity analysis of DNA sequences based on the EMD method [J].
Bai, Fenglan ;
Zhang, Jihong ;
Zheng, Junsheng .
APPLIED MATHEMATICS LETTERS, 2011, 24 (02) :232-237
[4]   An improved alignment-free model for dna sequence similarity metric [J].
Bao, Junpeng ;
Yuan, Ruiyu ;
Bao, Zhe .
BMC BIOINFORMATICS, 2014, 15
[5]   A novel 2D graphical representation of DNA sequences and its application [J].
Dai, Qi ;
Liu, Xiaoqing ;
Wang, Tianming .
JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2006, 25 (03) :340-344
[6]   A new method to analyze the similarity of the DNA sequences [J].
Guo, Ying ;
Wang, Tian-Ming .
JOURNAL OF MOLECULAR STRUCTURE-THEOCHEM, 2008, 853 (1-3) :62-67
[7]  
HAMORI E, 1983, J BIOL CHEM, V258, P1318
[8]   Characteristic sequences for DNA primary sequence [J].
He, PA ;
Wang, J .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2002, 42 (05) :1080-1085
[9]   Numerical encoding of DNA sequences by chaos game representation with application in similarity comparison [J].
Hoang, Tung ;
Yin, Changchuan ;
Yau, Stephen S. -T. .
GENOMICS, 2016, 108 (3-4) :134-142
[10]   A novel representation of DNA sequence based on CMI coding [J].
Hou, Wenbing ;
Pan, Qiuhui ;
He, Mingfeng .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2014, 409 :87-96