Finding the missing honey bee genes: lessons learned from a genome upgrade

Cited by: 323
Authors
Elsik, Christine G. [1, 2, 3, 4]
Worley, Kim C. [5]
Bennett, Anna K. [4]
Beye, Martin [6]
Camara, Francisco [7]
Childers, Christopher P. [4, 8]
de Graaf, Dirk C. [9]
Debyser, Griet [10]
Deng, Jixin [5]
Devreese, Bart [10]
Elhaik, Eran [11]
Evans, Jay D. [12]
Foster, Leonard J. [13]
Graur, Dan [14]
Guigo, Roderic [7]
Hoff, Katharina Jasmin [15]
Holder, Michael E. [5]
Hudson, Matthew E. [16, 17]
Hunt, Greg J. [18]
Jiang, Huaiyang [19]
Joshi, Vandita [5]
Khetani, Radhika S. [20]
Kosarev, Peter [21]
Kovar, Christie L. [5]
Ma, Jian [17, 22]
Maleszka, Ryszard [23]
Moritz, Robin F. A. [24]
Munoz-Torres, Monica C. [4, 25]
Murphy, Terence D. [26]
Muzny, Donna M. [5]
Newsham, Irene F. [5]
Reese, Justin T. [4, 8]
Robertson, Hugh M. [27]
Robinson, Gene E. [28]
Rueppell, Olav [29]
Solovyev, Victor [30]
Stanke, Mario [15]
Stolle, Eckart [24]
Tsuruda, Jennifer M. [31]
Van Vaerenbergh, Matthias [9]
Waterhouse, Robert M. [32, 33]
Weaver, Daniel B. [34]
Whitfield, Charles W. [35]
Wu, Yuanqing [5]
Zdobnov, Evgeny M. [32, 33]
Zhang, Lan [5]
Zhu, Dianhui [5]
Gibbs, Richard A. [5]
Affiliations
[1] Univ Missouri, Div Anim Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Div Plant Sci, Columbia, MO 65211 USA
[3] Univ Missouri, MU Informat Inst, Columbia, MO 65211 USA
[4] Georgetown Univ, Dept Biol, Washington, DC 20057 USA
[5] Baylor Coll Med, Human Genome Sequencing Ctr, Dept Mol & Human Genet, Houston, TX 77030 USA
[6] Univ Dusseldorf, Inst Evolutionary Genet, D-40225 Dusseldorf, Germany
[7] Univ Pompeu Fabra, Ctr Genom Regulat, E-08003 Barcelona, Catalonia, Spain
[8] Univ Missouri, Div Anim Sci, Columbia, MO 65211 USA
[9] Univ Ghent, Lab Zoophysiol, B-9000 Ghent, Belgium
[10] Univ Ghent, Lab Prot Biochem & Biomol Engn, B-9000 Ghent, Belgium
[11] Johns Hopkins Univ, Bloomberg Sch Publ Hlth, Dept Mental Hlth, Baltimore, MD 21205 USA
[12] USDA ARS, BARC E, Bee Res Lab, Beltsville, MD 20705 USA
[13] Univ British Columbia, Ctr High Throughput Biol, Dept Biochem & Mol Biol, Vancouver, BC V5Z 1M9, Canada
[14] Univ Houston, Dept Biol & Biochem, Houston, TX 77204 USA
[15] Univ Greifswald, Inst Math & Comp Sci, D-17487 Greifswald, Germany
[16] Univ Illinois, Dept Crop Sci, Urbana, IL 61801 USA
[17] Univ Illinois, Inst Genom Biol, Urbana, IL 61801 USA
[18] Purdue Univ, Dept Entomol, W Lafayette, IN 47907 USA
[19] Univ Pittsburgh, Dept Obstet Gynecol & Reprod Sci, Pittsburgh, PA 15260 USA
[20] Univ Illinois, Roy J Carver Biotechnol Ctr, High Performance Biol Comp HPCBio, Urbana, IL 61801 USA
[21] Softberry Inc, Mt Kisco, NY 10549 USA
[22] Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA
[23] Australian Natl Univ, Res Sch Biol, Canberra, ACT 0200, Australia
[24] Univ Halle Wittenberg, Inst Zool, D-06099 Halle, Saale, Germany
[25] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Genom Div, Berkeley, CA 94720 USA
[26] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
[27] Univ Illinois, Dept Entomol, Urbana, IL 61801 USA
[28] Univ Illinois, Inst Genom Biol, Dept Entomol, Neurosci Program, Urbana, IL 61801 USA
[29] Univ N Carolina, Dept Biol, Greensboro, NC 27412 USA
[30] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 239556900, Saudi Arabia
[31] Clemson Univ, Clemson, SC 29634 USA
[32] Univ Geneva, CH-1211 Geneva, Switzerland
[33] Swiss Inst Bioinformat, CMU, CH-1211 Geneva, Switzerland
[34] Genformatic, Austin, TX 78731 USA
[35] Univ Illinois, Dept Entomol, Neurosci Program, Program Ecol & Evolutionary Biol, Urbana, IL 61801 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Apis mellifera; GC content; Gene annotation; Gene prediction; Genome assembly; Genome improvement; Genome sequencing; Repetitive DNA; Transcriptome; APIS-MELLIFERA; CLASSIFICATION-SYSTEM; DOMAIN DATABASE; TANDEM REPEATS; DRAFT GENOME; ANNOTATION; EVOLUTION; SEQUENCE; EXPRESSION; PREDICTION;
DOI
10.1186/1471-2164-15-86
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Background: The first generation of genome sequence assemblies and annotations has had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, and the study of populations within and across species, and has informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes. Results: Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete, and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remainder were inferred based on new RNAseq and protein data. Conclusions: Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.
Pages: 29