Improved reference genome for the domestic horse increases assembly contiguity and composition

Cited by: 160
Authors
Kalbfleisch, Theodore S. [1]
Rice, Edward S. [2]
DePriest, Michael S., Jr. [1]
Walenz, Brian P. [3]
Hestand, Matthew S. [4]
Vermeesch, Joris R. [4]
O'Connell, Brendan L. [2, 16]
Fiddes, Ian T. [2, 5]
Vershinina, Alisa O. [6]
Saremi, Nedda F. [2]
Petersen, Jessica L. [7]
Finno, Carrie J. [8]
Bellone, Rebecca R. [8, 9]
McCue, Molly E. [10]
Brooks, Samantha A. [11]
Bailey, Ernest [12]
Orlando, Ludovic [13, 14]
Green, Richard E. [2]
Miller, Donald C. [15]
Antczak, Douglas F. [15]
MacLeod, James N. [12]
Affiliations
[1] Univ Louisville, Sch Med, Dept Biochem & Mol Genet, Louisville, KY 40292 USA
[2] UC Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[3] NHGRI, Genome Informat Sect, Computat & Stat Genom Branch, NIH, Bethesda, MD 20892 USA
[4] Katholieke Univ Leuven, Ctr Human Genet, B-3000 Leuven, Belgium
[5] 10x Genomics Inc, Pleasanton, CA 94566 USA
[6] UC Santa Cruz, Dept Ecol & Evolutionary Biol, Santa Cruz, CA 95064 USA
[7] Univ Nebraska, Dept Anim Sci, Lincoln, NE 68583 USA
[8] Univ Calif Davis, Dept Populat Hlth & Reprod, Davis, CA 95616 USA
[9] Univ Calif Davis, Vet Genet Lab, Davis, CA 95616 USA
[10] Univ Minnesota, Dept Vet Populat Med, St Paul, MN 55108 USA
[11] Univ Florida, UF Genet Inst, Dept Anim Sci, Gainesville, FL 32611 USA
[12] Univ Kentucky, Gluck Equine Res Ctr, Dept Vet Sci, Lexington, KY 40546 USA
[13] Nat Hist Museum Denmark, Ctr GeoGenet, DK-1350 Copenhagen, Denmark
[14] Univ Toulouse, Univ Paul Sabatier, Lab Anthropobiol Mol & Imagerie Synth, CNRS, UMR 5288, Toulouse, France
[15] Cornell Univ, Coll Vet Med, Baker Inst Anim Hlth, Ithaca, NY 14853 USA
[16] Oregon Hlth & Sci Univ, Med & Mol Genet, Portland, OR 97239 USA
Funding
US National Institutes of Health
Keywords
READ ALIGNMENT; MESSENGER-RNA; SEQUENCE; ANNOTATION; GENES;
DOI
10.1038/s42003-018-0199-z
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold, with a contig N50 of 4.5 Mb, and scaffold contiguity has been enhanced to the point where all but one of the 32 chromosomes are composed of a single scaffold.
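The two metrics quoted in the abstract, the count of non-N bases and the contig N50, can be illustrated with a minimal sketch. This is not the pipeline used for EquCab3. The function names (`n50`, `count_non_n`, `split_contigs`) and the toy scaffold strings are hypothetical, and the sketch assumes that contigs are the stretches of a scaffold separated by runs of N.

```python
import re


def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def count_non_n(sequence):
    """Count bases that are not the unknown-base placeholder N (either case)."""
    return sum(1 for base in sequence if base not in "Nn")


def split_contigs(scaffold):
    """Split a scaffold into contigs at runs of N (assumed gap convention)."""
    return [piece for piece in re.split(r"[Nn]+", scaffold) if piece]


if __name__ == "__main__":
    # Toy scaffolds, purely illustrative; real input would come from a FASTA file.
    scaffolds = ["ACGTNNNNACGTACGT", "ACNNGT"]
    contig_lengths = [len(c) for s in scaffolds for c in split_contigs(s)]
    print("non-N bases:", sum(count_non_n(s) for s in scaffolds))
    print("contig N50:", n50(contig_lengths))
```

Applying `n50` to scaffold lengths instead of contig lengths would give the scaffold N50, which is why the two figures can differ widely for the same assembly.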
Pages: 8