Finding the missing honey bee genes: lessons learned from a genome upgrade

Cited by: 311
Authors
Elsik, Christine G. [1 ,2 ,3 ,4 ]
Worley, Kim C. [5 ]
Bennett, Anna K. [4 ]
Beye, Martin [6 ]
Camara, Francisco [7 ]
Childers, Christopher P. [4 ,8 ]
de Graaf, Dirk C. [9 ]
Debyser, Griet [10 ]
Deng, Jixin [5 ]
Devreese, Bart [10 ]
Elhaik, Eran [11 ]
Evans, Jay D. [12 ]
Foster, Leonard J. [13 ]
Graur, Dan [14 ]
Guigo, Roderic [7 ]
Hoff, Katharina Jasmin [15 ]
Holder, Michael E. [5 ]
Hudson, Matthew E. [16 ,17 ]
Hunt, Greg J. [18 ]
Jiang, Huaiyang [19 ]
Joshi, Vandita [5 ]
Khetani, Radhika S. [20 ]
Kosarev, Peter [21 ]
Kovar, Christie L. [5 ]
Ma, Jian [17 ,22 ]
Maleszka, Ryszard [23 ]
Moritz, Robin F. A. [24 ]
Munoz-Torres, Monica C. [4 ,25 ]
Murphy, Terence D. [26 ]
Muzny, Donna M. [5 ]
Newsham, Irene F. [5 ]
Reese, Justin T. [4 ,8 ]
Robertson, Hugh M. [27 ]
Robinson, Gene E. [28 ]
Rueppell, Olav [29 ]
Solovyev, Victor [30 ]
Stanke, Mario [15 ]
Stolle, Eckart [24 ]
Tsuruda, Jennifer M. [31 ]
Van Vaerenbergh, Matthias [9 ]
Waterhouse, Robert M. [32 ,33 ]
Weaver, Daniel B. [34 ]
Whitfield, Charles W. [35 ]
Wu, Yuanqing [5 ]
Zdobnov, Evgeny M. [32 ,33 ]
Zhang, Lan [5 ]
Zhu, Dianhui [5 ]
Gibbs, Richard A. [5 ]
Affiliations
[1] Univ Missouri, Div Anim Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Div Plant Sci, Columbia, MO 65211 USA
[3] Univ Missouri, MU Informat Inst, Columbia, MO 65211 USA
[4] Georgetown Univ, Dept Biol, Washington, DC 20057 USA
[5] Baylor Coll Med, Human Genome Sequencing Ctr, Dept Mol & Human Genet, Houston, TX 77030 USA
[6] Univ Dusseldorf, Inst Evolutionary Genet, D-40225 Dusseldorf, Germany
[7] Univ Pompeu Fabra, Ctr Genom Regulat, E-08003 Barcelona, Catalonia, Spain
[8] Univ Missouri, Div Anim Sci, Columbia, MO 65211 USA
[9] Univ Ghent, Lab Zoophysiol, B-9000 Ghent, Belgium
[10] Univ Ghent, Lab Prot Biochem & Biomol Engn, B-9000 Ghent, Belgium
[11] Johns Hopkins Univ, Bloomberg Sch Publ Hlth, Dept Mental Hlth, Baltimore, MD 21205 USA
[12] USDA ARS, BARC E, Bee Res Lab, Beltsville, MD 20705 USA
[13] Univ British Columbia, Ctr High Throughput Biol, Dept Biochem & Mol Biol, Vancouver, BC V5Z 1M9, Canada
[14] Univ Houston, Dept Biol & Biochem, Houston, TX 77204 USA
[15] Univ Greifswald, Inst Math & Comp Sci, D-17487 Greifswald, Germany
[16] Univ Illinois, Dept Crop Sci, Urbana, IL 61801 USA
[17] Univ Illinois, Inst Genom Biol, Urbana, IL 61801 USA
[18] Purdue Univ, Dept Entomol, W Lafayette, IN 47907 USA
[19] Univ Pittsburgh, Dept Obstet Gynecol & Reprod Sci, Pittsburgh, PA 15260 USA
[20] Univ Illinois, Roy J Carver Biotechnol Ctr, High Performance Biol Comp HPCBio, Urbana, IL 61801 USA
[21] Softberry Inc, Mt Kisco, NY 10549 USA
[22] Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA
[23] Australian Natl Univ, Res Sch Biol, Canberra, ACT 0200, Australia
[24] Univ Halle Wittenberg, Inst Zool, D-06099 Halle, Saale, Germany
[25] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Genom Div, Berkeley, CA 94720 USA
[26] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
[27] Univ Illinois, Dept Entomol, Urbana, IL 61801 USA
[28] Univ Illinois, Inst Genom Biol, Dept Entomol, Neurosci Program, Urbana, IL 61801 USA
[29] Univ N Carolina, Dept Biol, Greensboro, NC 27412 USA
[30] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 239556900, Saudi Arabia
[31] Clemson Univ, Clemson, SC 29634 USA
[32] Univ Geneva, CH-1211 Geneva, Switzerland
[33] Swiss Inst Bioinformat, CMU, CH-1211 Geneva, Switzerland
[34] Genformatic, Austin, TX 78731 USA
[35] Univ Illinois, Dept Entomol, Neurosci Program, Program Ecol & Evolutionary Biol, Urbana, IL 61801 USA
Source
BMC GENOMICS | 2014 / Volume 15
Funding
US National Institutes of Health; US National Science Foundation
Keywords
Apis mellifera; GC content; Gene annotation; Gene prediction; Genome assembly; Genome improvement; Genome sequencing; Repetitive DNA; Transcriptome; APIS-MELLIFERA; CLASSIFICATION-SYSTEM; DOMAIN DATABASE; TANDEM REPEATS; DRAFT GENOME; ANNOTATION; EVOLUTION; SEQUENCE; EXPRESSION; PREDICTION;
DOI
10.1186/1471-2164-15-86
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: The first generation of genome sequence assemblies and annotations has had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, and the study of populations within and across species, and has informed the biology of humans. As only a few metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution, which affected the quality of the assembly in some regions, and for having fewer genes in the initial gene set (OGSv1.0) than would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete, and the new gene set includes approximately 5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remainder were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.
Pages: 29