GENCODE 2021

Cited by: 726
Authors
Frankish, Adam [1]
Diekhans, Mark [2]
Jungreis, Irwin [3,4]
Lagarde, Julien [5]
Loveland, Jane E. [1]
Mudge, Jonathan M. [1]
Sisu, Cristina [6,7]
Wright, James C. [8]
Armstrong, Joel [2]
Barnes, If [1]
Berry, Andrew [1]
Bignell, Alexandra [1]
Boix, Carles [3,4,9]
Carbonell Sala, Silvia [5]
Cunningham, Fiona [1]
Di Domenico, Tomas [10]
Donaldson, Sarah [1]
Fiddes, Ian T. [2]
Giron, Carlos Garcia [1]
Gonzalez, Jose Manuel [1]
Grego, Tiago [1]
Hardy, Matthew [1]
Hourlier, Thibaut [1]
Howe, Kevin L. [1]
Hunt, Toby [1]
Izuogu, Osagie G. [1]
Johnson, Rory [11,12]
Martin, Fergal J. [1]
Martinez, Laura [10]
Mohanan, Shamika [1]
Muir, Paul [13,14]
Navarro, Fabio C. P. [6]
Parker, Anne [1]
Pei, Baikang [6]
Pozo, Fernando [10]
Riera, Ferriol Calvet [1]
Ruffier, Magali [1]
Schmitt, Bianca M. [1]
Stapleton, Eloise [1]
Suner, Marie-Marthe [1]
Sycheva, Irina [1]
Uszczynska-Ratajczak, Barbara [15]
Wolf, Maxim Y. [16]
Xu, Jinuri [6]
Yang, Yucheng T. [6,17]
Yates, Andrew [1]
Zerbino, Daniel [1]
Zhang, Yan [6,18]
Choudhary, Jyoti S. [8]
Gerstein, Mark [6,17,19]
Affiliations
[1] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Cambridge CB10 1SD, England
[2] Univ Calif Santa Cruz, UC Santa Cruz Genom Inst, Santa Cruz, CA 95064 USA
[3] MIT, Comp Sci & Artificial Intelligence Lab, 32 Vassar St, Cambridge, MA 02139 USA
[4] Broad Inst MIT & Harvard, 415 Main St, Cambridge, MA 02142 USA
[5] Barcelona Inst Sci & Technol, Ctr Genom Regulat CRG, Dr Aiguader 88, E-08003 Barcelona, Catalonia, Spain
[6] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[7] Brunel Univ London, Dept Biosci, Uxbridge UB8 3PH, Middx, England
[8] Inst Canc Res, Div Canc Biol, Funct Prote, 237 Fulham Rd, London SW3 6JB, England
[9] MIT, Computat & Syst Biol Program, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[10] Spanish Natl Canc Res Ctr CNIO, Bioinformat Unit, Madrid, Spain
[11] Univ Bern, Univ Hosp, Dept Med Oncol, Inselspital, Bern, Switzerland
[12] Univ Bern, Dept Biomed Res DBMR, Bern, Switzerland
[13] Yale Univ, Dept Mol Cellular & Dev Biol, New Haven, CT 06520 USA
[14] Yale Univ, Syst Biol Inst, West Haven, CT 06516 USA
[15] Univ Warsaw, Ctr New Technol, Warsaw, Poland
[16] Harvard Med Sch, Dept Biomed Informat, 10 Shattuck St,Suite 514, Boston, MA 02115 USA
[17] Yale Univ, Program Computat Biol & Bioinformat, Bass 432,266 Whitney Ave, New Haven, CT 06520 USA
[18] Ohio State Univ, Coll Med, Dept Biomed Informat, Columbus, OH 43210 USA
[19] Yale Univ, Dept Comp Sci, Bass 432,266 Whitney Ave, New Haven, CT 06520 USA
[20] Univ Pompeu Fabra UPF, E-08003 Barcelona, Catalonia, Spain
[21] Guys Hosp, Kings Coll London, Dept Med & Mol Genet, Great Maze Pond, London SE1 9RT, England
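Funding
National Institutes of Health (USA); Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (UK); Swiss National Science Foundation; Medical Research Council (UK)
Keywords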
LONG NONCODING RNAS; ANNOTATION; DATABASE; ATLAS;
DOI
10.1093/nar/gkaa1087
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs.
Pages: D916-D923
Page count: 8
Related papers
33 in total
[1]   The Ensembl gene annotation system [J].
Aken, Bronwen L. ;
Ayling, Sarah ;
Barrell, Daniel ;
Clarke, Laura ;
Curwen, Valery ;
Fairley, Susan ;
Banet, Julio Fernandez ;
Billis, Konstantinos ;
Giron, Carlos Garcia ;
Hourlier, Thibaut ;
Howe, Kevin ;
Kahari, Andreas ;
Kokocinski, Felix ;
Martin, Fergal J. ;
Murphy, Daniel N. ;
Nag, Rishi ;
Ruffier, Magali ;
Schuster, Michael ;
Tang, Y. Amy ;
Vogel, Jan-Hinnerk ;
White, Simon ;
Zadissa, Amonida ;
Flicek, Paul ;
Searle, Stephen M. J. .
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2016,
[2]   Progressive Cactus is a multiple-genome aligner for the thousand-genome era [J].
Armstrong, Joel ;
Hickey, Glenn ;
Diekhans, Mark ;
Fiddes, Ian T. ;
Novak, Adam M. ;
Deran, Alden ;
Fang, Qi ;
Xie, Duo ;
Feng, Shaohong ;
Stiller, Josefin ;
Genereux, Diane ;
Johnson, Jeremy ;
Marinescu, Voichita Dana ;
Alfoldi, Jessica ;
Harris, Robert S. ;
Lindblad-Toh, Kerstin ;
Haussler, David ;
Karlsson, Elinor ;
Jarvis, Erich D. ;
Zhang, Guojie ;
Paten, Benedict .
NATURE, 2020, 587 (7833) :246-+
[3]   Expert curation of the human and mouse olfactory receptor gene repertoires identifies conserved coding regions split across two exons [J].
Barnes, If H. A. ;
Ibarra-Soria, Ximena ;
Fitzgerald, Stephen ;
Gonzalez, Jose M. ;
Davidson, Claire ;
Hardy, Matthew P. ;
Manthravadi, Deepa ;
Van Gerven, Laura ;
Jorissen, Mark ;
Zeng, Zhen ;
Khan, Mona ;
Mombaerts, Peter ;
Harrow, Jennifer ;
Logan, Darren W. ;
Frankish, Adam .
BMC GENOMICS, 2020, 21 (01)
[4]   High-efficiency full-length cDNA cloning by biotinylated CAP trapper [J].
Carninci, P ;
Kvam, C ;
Kitamura, A ;
Ohsumi, T ;
Okazaki, Y ;
Itoh, M ;
Kamiya, M ;
Shibata, K ;
Sasaki, N ;
Izawa, M ;
Muramatsu, M ;
Hayashizaki, Y ;
Schneider, C .
GENOMICS, 1996, 37 (03) :327-336
[5]   Locus Reference Genomic sequences: an improved basis for describing human DNA variants [J].
Dalgleish, Raymond ;
Flicek, Paul ;
Cunningham, Fiona ;
Astashyn, Alex ;
Tully, Raymond E. ;
Proctor, Glenn ;
Chen, Yuan ;
McLaren, William M. ;
Larsson, Pontus ;
Vaughan, Brendan W. ;
Beroud, Christophe ;
Dobson, Glen ;
Lehvaeslaiho, Heikki ;
Taschner, Peter E. M. ;
den Dunnen, Johan T. ;
Devereau, Andrew ;
Birney, Ewan ;
Brookes, Anthony J. ;
Maglott, Donna R. .
GENOME MEDICINE, 2010, 2
[6]   NONCODEV5: a comprehensive annotation database for long non-coding RNAs [J].
Fang, ShuangSang ;
Zhang, LiLi ;
Guo, JinCheng ;
Niu, YiWei ;
Wu, Yang ;
Li, Hui ;
Zhao, Lian He ;
Li, Xi Yuan ;
Teng, Xue Yi ;
Sun, XianHui ;
Sun, Liang ;
Zhang, Michael Q. ;
Chen, RunSheng ;
Zhao, Yi .
NUCLEIC ACIDS RESEARCH, 2018, 46 (D1) :D308-D314
[7]  
Gordon David E, 2020, bioRxiv, DOI 10.1101/2020.03.22.002386
[8]   GENCODE: producing a reference annotation for ENCODE [J].
Harrow, Jennifer ;
Denoeud, France ;
Frankish, Adam ;
Reymond, Alexandre ;
Chen, Chao-Kung ;
Chrast, Jacqueline ;
Lagarde, Julien ;
Gilbert, James Gr ;
Storey, Roy ;
Swarbreck, David ;
Rossier, Colette ;
Ucla, Catherine ;
Hubbard, Tim ;
Antonarakis, Stylianos E. ;
Guigo, Roderic .
GENOME BIOLOGY, 2006, 7 (Suppl 1)
[9]   GENCODE: The reference human genome annotation for The ENCODE Project [J].
Harrow, Jennifer ;
Frankish, Adam ;
Gonzalez, Jose M. ;
Tapanari, Electra ;
Diekhans, Mark ;
Kokocinski, Felix ;
Aken, Bronwen L. ;
Barrell, Daniel ;
Zadissa, Amonida ;
Searle, Stephen ;
Barnes, If ;
Bignell, Alexandra ;
Boychenko, Veronika ;
Hunt, Toby ;
Kay, Mike ;
Mukherjee, Gaurab ;
Rajan, Jeena ;
Despacio-Reyes, Gloria ;
Saunders, Gary ;
Steward, Charles ;
Harte, Rachel ;
Lin, Michael ;
Howald, Cedric ;
Tanzer, Andrea ;
Derrien, Thomas ;
Chrast, Jacqueline ;
Walters, Nathalie ;
Balasubramanian, Suganthi ;
Pei, Baikang ;
Tress, Michael ;
Manuel Rodriguez, Jose ;
Ezkurdia, Iakes ;
van Baren, Jeltje ;
Brent, Michael ;
Haussler, David ;
Kellis, Manolis ;
Valencia, Alfonso ;
Reymond, Alexandre ;
Gerstein, Mark ;
Guigo, Roderic ;
Hubbard, Tim J. .
GENOME RESEARCH, 2012, 22 (09) :1760-1774
[10]   An atlas of human long non-coding RNAs with accurate 5′ ends [J].
Hon, Chung-Chau ;
Ramilowski, Jordan A. ;
Harshbarger, Jayson ;
Bertin, Nicolas ;
Rackham, Owen J. L. ;
Gough, Julian ;
Denisenko, Elena ;
Schmeier, Sebastian ;
Poulsen, Thomas M. ;
Severin, Jessica ;
Lizio, Marina ;
Kawaji, Hideya ;
Kasukawa, Takeya ;
Itoh, Masayoshi ;
Burroughs, A. Maxwell ;
Noma, Shohei ;
Djebali, Sarah ;
Alam, Tanvir ;
Medvedeva, Yulia A. ;
Testa, Alison C. ;
Lipovich, Leonard ;
Yip, Chi-Wai ;
Abugessaisa, Imad ;
Mendez, Mickael ;
Hasegawa, Akira ;
Tang, Dave ;
Lassmann, Timo ;
Heutink, Peter ;
Babina, Magda ;
Wells, Christine A. ;
Kojima, Soichi ;
Nakamura, Yukio ;
Suzuki, Harukazu ;
Daub, Carsten O. ;
de Hoon, Michiel J. L. ;
Arner, Erik ;
Hayashizaki, Yoshihide ;
Carninci, Piero ;
Forrest, Alistair R. R. .
NATURE, 2017, 543 (7644) :199-+