Sorting by Genome Rearrangements on Both Gene Order and Intergenic Sizes

被引:15
作者
Brito, Klairton Lima [1 ]
Jean, Geraldine [2 ]
Fertin, Guillaume [2 ]
Oliveira, Andre Rodrigues [1 ]
Dias, Ulisses [3 ]
Dias, Zanoni [1 ]
机构
[1] Univ Estadual Campinas, Inst Comp, 1251 Albert Einstein Ave, BR-13083852 Campinas, SP, Brazil
[2] Univ Nantes, LS2N UMR CNRS 6004, Nantes, France
[3] Univ Estadual Campinas, Sch Technol, Limeira, Brazil
基金
巴西圣保罗研究基金会;
关键词
approximation algorithms; genome rearrangements; intergenic regions; weighted operations; REVERSALS; TRANSPOSITIONS; PERMUTATIONS; ALGORITHM;
D O I
10.1089/cmb.2019.0293
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
During the evolutionary process, genomes are affected by various genome rearrangements, that is, events that modify large stretches of the genetic material. In the literature, a large number of models have been proposed to estimate the number of events that occurred during evolution; most of them represent a genome as an ordered sequence of genes, and, in particular, disregard the genetic material between consecutive genes. However, recent studies showed that taking into account the genetic material between consecutive genes can enhance evolutionary distance estimations. Reversal and transposition are genome rearrangements that have been widely studied in the literature. A reversal inverts a (contiguous) segment of the genome, while a transposition swaps the positions of two consecutive segments. Genomes also undergo nonconservative events (events that alter the amount of genetic material) such as insertions and deletions, in which genetic material from intergenic regions of the genome is inserted or deleted, respectively. In this article, we study a genome rearrangement model that considers both gene order and sizes of intergenic regions. We investigate the reversal distance, and also the reversal and transposition distance between two genomes in two scenarios: with and without nonconservative events. We show that these problems are NP-hard and we present constant ratio approximation algorithms for all of them. More precisely, we provide a 4-approximation algorithm for the reversal distance, both in the conservative and nonconservative versions. For the reversal and transposition distance, we provide a 4.5-approximation algorithm, both in the conservative and nonconservative versions. We also perform experimental tests to verify the behavior of our algorithms, as well as to compare the practical and theoretical results. We finally extend our study to scenarios in which events have different costs, and we present constant ratio approximation algorithms for each scenario.
引用
收藏
页码:156 / 174
页数:19
相关论文
共 17 条
[1]  
Alexandrino A.O., 2018, LECT NOTES COMPUTER, V10849, P53
[2]   Sorting by weighted reversals, transpositions, and inverted transpositions [J].
Bader, Martin ;
Ohlebusch, Enno .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2007, 14 (05) :615-636
[3]  
Berman P., 2002, LECT NOTES COMPUTER, P200
[4]   Comparative Genomics on Artificial Life [J].
Biller, Priscila ;
Knibbe, Carole ;
Beslon, Guillaume ;
Tannier, Eric .
PURSUIT OF THE UNIVERSAL, 2016, 9709 :35-44
[5]   Breaking Good: Accounting for Fragility of Genomic Regions in Rearrangement Distance Estimation [J].
Biller, Priscila ;
Gueguen, Laurent ;
Knibbe, Carole ;
Tannier, Eric .
GENOME BIOLOGY AND EVOLUTION, 2016, 8 (05) :1427-1439
[6]  
Blanchette M, 1996, GENE, V172, pGC11, DOI 10.1016/0378-1119(95)00878-0
[7]   Genome rearrangements with indels in intergenes restrict the scenario space [J].
Bulteau, Laurent ;
Fertin, Guillaume ;
Tannier, Eric .
BMC BIOINFORMATICS, 2016, 17
[8]   SORTING BY TRANSPOSITIONS IS DIFFICULT [J].
Bulteau, Laurent ;
Fertin, Guillaume ;
Rusu, Irena .
SIAM JOURNAL ON DISCRETE MATHEMATICS, 2012, 26 (03) :1148-1180
[10]   A 1.375-approximation algorithm for sorting by transpositions [J].
Elias, Isaac ;
Hartman, Tzvika .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2006, 3 (04) :369-379