The de novo genome assembly and annotation of a female domestic dromedary of North African origin

Cited by: 49
Authors
Fitak, Robert R. [1 ]
Mohandesan, Elmira [1 ]
Corander, Jukka [2 ]
Burger, Pamela A. [1 ]
Affiliations
[1] Vetmeduni Vienna, Inst Populat Genet, Vet Pl 1, A-1210 Vienna, Austria
[2] Univ Helsinki, Dept Math & Stat, FIN-0014 Helsinki, Finland
Funding
Austrian Science Fund; Academy of Finland
Keywords
adaptation; Camelus dromedarius; demography; domestication; next-generation sequencing; REPRODUCIBLE RESEARCH; SEQUENCING DATA; IDENTIFICATION; INFERENCE; EVOLUTION; ALIGNMENT; PIPELINE; CAMEL; TOOL; DATABASE;
DOI
10.1111/1755-0998.12443
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The single-humped dromedary (Camelus dromedarius) is the most numerous and widespread of the domestic camel species and is a significant source of meat, milk, wool, transportation and sport for millions of people. Dromedaries are particularly well adapted to hot, desert conditions and harbour a variety of biological and physiological characteristics of evolutionary, economic and medical importance. To understand the genetic basis of these traits, an extensive resource of genomic variation is required. In this study, we assembled, at 65× coverage, a 2.06 Gb draft genome of a female dromedary whose ancestry can be traced to an isolated population from the Canary Islands. We annotated 21,167 protein-coding genes and estimated ~33.7% of the genome to be repetitive. A comparison with the recently published draft genome of an Arabian dromedary resulted in 1.91 Gb of aligned sequence with a divergence of 0.095%. An evaluation of our genome against the reference revealed that our assembly contains more error-free bases (91.2%) and fewer scaffolding errors. We identified ~1.4 million single-nucleotide polymorphisms with a mean density of 0.71 × 10⁻³ per base. An analysis of demographic history indicated that changes in effective population size corresponded with recent glacial epochs. Our de novo assembly provides a useful resource of genomic variation for future studies of the camel's adaptations to arid environments and economically important traits. Furthermore, these results suggest that draft genome assemblies constructed with only two differently sized sequencing libraries can be comparable to those sequenced using additional library sizes, highlighting that additional resources might be better invested in technologies other than short-read sequencing to physically anchor scaffolds to genome maps.
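The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: the function names are hypothetical, and the inputs are the rounded values from the abstract, so the results are approximate.

```python
# Rounded figures taken from the abstract; all results are approximate.
SNP_COUNT = 1.4e6         # ~1.4 million single-nucleotide polymorphisms
SNP_DENSITY = 0.71e-3     # mean SNPs per base
ALIGNED_BASES = 1.91e9    # 1.91 Gb aligned to the Arabian dromedary draft
DIVERGENCE = 0.095 / 100  # 0.095 % expressed as a fraction


def implied_bases(snp_count: float, density: float) -> float:
    """Number of bases implied by a SNP count and a per-base density."""
    return snp_count / density


def expected_differences(aligned_bases: float, divergence: float) -> float:
    """Approximate number of differing sites across the aligned sequence."""
    return aligned_bases * divergence


if __name__ == "__main__":
    bases = implied_bases(SNP_COUNT, SNP_DENSITY)
    diffs = expected_differences(ALIGNED_BASES, DIVERGENCE)
    print(f"Bases implied by SNP density: {bases / 1e9:.2f} Gb")
    print(f"Differences implied by divergence: {diffs / 1e6:.2f} million")
```

Under these rounded inputs, the SNP count and density imply roughly 1.97 Gb of sequence, which is slightly below the 2.06 Gb assembly size. The abstract does not state which sites were counted, so the reason for the gap is left open here.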
Pages: 314-324
Number of pages: 11