Improved reference genome for the domestic horse increases assembly contiguity and composition

Cited by: 160
Authors
Kalbfleisch, Theodore S. [1]
Rice, Edward S. [2]
DePriest, Michael S., Jr. [1]
Walenz, Brian P. [3]
Hestand, Matthew S. [4]
Vermeesch, Joris R. [4]
O'Connell, Brendan L. [2, 16]
Fiddes, Ian T. [2, 5]
Vershinina, Alisa O. [6]
Saremi, Nedda F. [2]
Petersen, Jessica L. [7]
Finno, Carrie J. [8]
Bellone, Rebecca R. [8, 9]
McCue, Molly E. [10]
Brooks, Samantha A. [11]
Bailey, Ernest [12]
Orlando, Ludovic [13, 14]
Greene, Richard E. [2]
Miller, Donald C. [15]
Antczak, Douglas F. [15]
MacLeod, James N. [12]
Affiliations
[1] Univ Louisville, Sch Med, Dept Biochem & Mol Genet, Louisville, KY 40292 USA
[2] UC Santa Cruz, Dept Biomol Engn, Santa Cruz, CA 95064 USA
[3] NHGRI, Genome Informat Sect, Computat & Stat Genom Branch, NIH, Bethesda, MD 20892 USA
[4] Katholieke Univ Leuven, Ctr Human Genet, B-3000 Leuven, Belgium
[5] 10x Genomics Inc, Pleasanton, CA 94566 USA
[6] UC Santa Cruz, Dept Ecol & Evolutionary Biol, Santa Cruz, CA 95064 USA
[7] Univ Nebraska, Dept Anim Sci, Lincoln, NE 68583 USA
[8] Univ Calif Davis, Dept Populat Hlth & Reprod, Davis, CA 95616 USA
[9] Univ Calif Davis, Vet Genet Lab, Davis, CA 95616 USA
[10] Univ Minnesota, Dept Vet Populat Med, St Paul, MN 55108 USA
[11] Univ Florida, UF Genet Inst, Dept Anim Sci, Gainesville, FL 32611 USA
[12] Univ Kentucky, Gluck Equine Res Ctr, Dept Vet Sci, Lexington, KY 40546 USA
[13] Nat Hist Museum Denmark, Ctr GeoGenet, DK-1350 Copenhagen, Denmark
[14] Univ Toulouse, Univ Paul Sabatier, Lab Anthropobiol Mol & Imagerie Synth, CNRS, UMR 5288, Toulouse, France
[15] Cornell Univ, Coll Vet Med, Baker Inst Anim Hlth, Ithaca, NY 14853 USA
[16] Oregon Hlth & Sci Univ, Med & Mol Genet, Portland, OR 97239 USA
Funding
US National Institutes of Health;
Keywords
READ ALIGNMENT; MESSENGER-RNA; SEQUENCE; ANNOTATION; GENES;
DOI
10.1038/s42003-018-0199-z
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although its quality equaled or exceeded that of other first-generation Sanger assemblies, it shared many of their shortcomings. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes has increased from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also improved nearly 40-fold, to a contig N50 of 4.5 Mb, and scaffold contiguity has improved to the point where all but one of the 32 chromosomes consists of a single scaffold.
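The abstract reports two assembly metrics, the count of non-N bases and the contig N50. As a rough illustration of how such metrics are commonly computed, here is a minimal sketch in Python. The function names (`n50`, `count_non_n`, `split_contigs`) and the `min_gap` parameter are hypothetical, chosen for this example, and the sketch assumes that contigs are the stretches of a scaffold separated by runs of `N`. It is not the authors' pipeline.

```python
import re
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that sequences of length >= L
    together cover at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def count_non_n(seq: str) -> int:
    """Count bases that are not gap placeholders (N or n)."""
    return sum(1 for base in seq if base not in "Nn")


def split_contigs(scaffold: str, min_gap: int = 1) -> List[str]:
    """Split a scaffold into contigs at runs of at least min_gap N characters.

    min_gap is an assumption of this sketch; real tools use their own
    gap-length conventions.
    """
    pattern = r"[Nn]{%d,}" % min_gap
    return [piece for piece in re.split(pattern, scaffold) if piece]


if __name__ == "__main__":
    toy_scaffold = "ACGTNNNNGGCCNAT"  # toy data, not from the paper
    contigs = split_contigs(toy_scaffold)
    print("non-N bases:", count_non_n(toy_scaffold))
    print("contig N50:", n50(len(c) for c in contigs))
```

Applied to a real assembly, the same functions would take the scaffold sequences read from a FASTA file. Reading that file is left out to keep the sketch self-contained.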
Pages: 8