Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms

Times cited: 13
Authors
Bick, Jochen T. [1]
Zeng, Shuqin [1, 2]
Robinson, Mark D. [3, 4]
Ulbrich, Susanne E. [1]
Bauersachs, Stefan [1, 2]
Affiliations
[1] Swiss Fed Inst Technol, Inst Agr Sci, Anim Physiol, Zurich, Switzerland
[2] Univ Zurich, Vetsuisse Fac Zurich, Genet & Funct Genom, Zurich, Switzerland
[3] Univ Zurich, Inst Mol Life Sci, Zurich, Switzerland
[4] Univ Zurich, SIB, Zurich, Switzerland
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2019
Funding
Swiss National Science Foundation
Keywords
GENE; BIOINFORMATICS; BIOCONDUCTOR
DOI
10.1093/database/baz086
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Next-generation sequencing technologies and the availability of an increasing number of mammalian and other genomes allow gene expression studies, particularly RNA sequencing, in many non-model organisms. However, incomplete genome annotation and incomplete assignment of genes to functional annotation databases can lead to a substantial loss of information in downstream data analysis. To overcome this, we developed the Mammalian Annotation Database tool (MAdb, https://madb.ethz.ch) to conveniently provide homologous gene information for selected mammalian species. The assignment between species is performed in three steps: (i) matching official gene symbols, (ii) using ortholog information contained in Ensembl Compara and (iii) pairwise BLAST comparisons of all transcripts. In addition, we developed a new tool (AnnOverlappeR) for the reliable assignment of National Center for Biotechnology Information (NCBI) and Ensembl gene IDs to each other. Gene lists translated to gene IDs of a well-annotated species such as human can be used for improved functional annotation with relevant tools based on Gene Ontology and molecular pathway information. We tested MAdb on a published RNA-seq data set for the pig and showed clearly improved overrepresentation analysis results based on the assigned human homologous gene identifiers. MAdb yielded a similar list of human homologous genes and similar functional annotation results regardless of whether the analysis started with gene IDs from NCBI or Ensembl. MAdb is accessible via a web interface and a Galaxy application.
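The three-step assignment described in the abstract can be pictured as a tiered lookup. The sketch below is illustrative only and is not the MAdb implementation: the lookup tables, the gene identifiers (PIG_GENE_A and so on), the function name and the E-value cutoff are all hypothetical, and the assumption that a later step is consulted only when the earlier ones find nothing is ours rather than something the abstract states.

```python
from typing import Optional, Tuple

# Hypothetical toy data; a real run would draw on the database's own tables.
HUMAN_SYMBOLS = {"TP53", "GAPDH", "ACTB"}          # official human gene symbols
COMPARA_ORTHOLOGS = {"PIG_GENE_B": "IFNG"}         # source gene ID -> human symbol
BLAST_BEST_HITS = {                                # source gene ID -> (human symbol, E-value)
    "PIG_GENE_C": ("MX1", 1e-50),
    "PIG_GENE_D": ("OAS1", 0.5),
}


def assign_human_homolog(
    gene_id: str,
    symbol: Optional[str],
    evalue_cutoff: float = 1e-10,  # assumed threshold, not taken from the paper
) -> Tuple[Optional[str], str]:
    """Return (human_symbol, evidence), trying the three steps in order."""
    # Step (i): the official gene symbol matches a human symbol.
    if symbol and symbol.upper() in HUMAN_SYMBOLS:
        return symbol.upper(), "symbol"
    # Step (ii): an ortholog is recorded (Ensembl Compara in the paper).
    if gene_id in COMPARA_ORTHOLOGS:
        return COMPARA_ORTHOLOGS[gene_id], "compara"
    # Step (iii): fall back to the best pairwise BLAST hit, if significant.
    hit = BLAST_BEST_HITS.get(gene_id)
    if hit is not None and hit[1] <= evalue_cutoff:
        return hit[0], "blast"
    return None, "unassigned"


if __name__ == "__main__":
    queries = [
        ("PIG_GENE_A", "TP53"),
        ("PIG_GENE_B", None),
        ("PIG_GENE_C", "LOC000"),
        ("PIG_GENE_D", None),
    ]
    for gene_id, symbol in queries:
        print(gene_id, assign_human_homolog(gene_id, symbol))
```

Recording which step produced each assignment keeps the evidence visible downstream, so that, for example, BLAST-only assignments can be filtered or inspected separately before functional annotation.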
Pages: 16