Tissue enrichment analysis for C. elegans genomics

Cited by: 111
Authors
Angeles-Albores, David
Lee, Raymond Y. N.
Chan, Juancarlos
Sternberg, Paul W. [1]
Affiliations
[1] HHMI, 1200 E Calif Blvd, Pasadena, CA 91125 USA
Source
BMC BIOINFORMATICS | 2016 / Vol. 17
Keywords
Gene ontology; Anatomy ontology; WormBase; RNA-seq; High-throughput biology; EXPRESSION; ONTOLOGY; IDENTIFICATION; GENES; CELL;
DOI
10.1186/s12859-016-1229-9
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Over the last ten years, there has been explosive development in methods for measuring gene expression. These methods can identify thousands of genes altered between conditions, but understanding these datasets and forming hypotheses based on them remains challenging. One way to analyze these datasets is to associate ontologies (hierarchical, descriptive vocabularies with controlled relations between terms) with genes and to look for enrichment of specific terms. Although Gene Ontology (GO) is available for Caenorhabditis elegans, it does not include anatomical information. Results: We have developed a tool for identifying enrichment of C. elegans tissues among gene sets and built a web-based GUI where users can access this tool. Since a common drawback of ontology enrichment analyses is their verbosity, we developed a very simple filtering algorithm to reduce the ontology size by an order of magnitude. We adjusted these filters and validated our tool using a set of 30 gold standards from Expression Cluster data in WormBase. We show that our tool can discriminate between embryonic and larval tissues and can identify tissues down to the single-cell level. We used our tool to identify multiple neuronal tissues that are down-regulated due to pathogen infection in C. elegans. Conclusions: Our Tissue Enrichment Analysis (TEA) can be found within WormBase, and can be installed using Python's standard pip installer. It tests a slimmed-down C. elegans tissue ontology for enrichment of specific terms and provides users with a text and graphic representation of the results.
Pages: 10
Related papers
34 in total
[11]   DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists [J].
Huang, Da Wei ;
Sherman, Brad T. ;
Tan, Qina ;
Kir, Joseph ;
Liu, David ;
Bryant, David ;
Guo, Yongjian ;
Stephens, Robert ;
Baseler, Michael W. ;
Lane, H. Clifford ;
Lempicki, Richard A. .
NUCLEIC ACIDS RESEARCH, 2007, 35 :W169-W175
[12]   Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources [J].
Huang, Da Wei ;
Sherman, Brad T. ;
Lempicki, Richard A. .
NATURE PROTOCOLS, 2009, 4 (01) :44-57
[13]  
Kelly WG, 1997, GENETICS, V146, P227
[14]   Development of sustainability network theory (SNT) and model for managing electronics industrial system [J].
Kim, Junbeum ;
Allenby, Braden .
PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL SYMPOSIUM ON ELECTRONICS & THE ENVIRONMENT, CONFERENCE RECORD, 2007, :170-+
[15]   Building a cell and anatomy ontology of Caenorhabditis elegans [J].
Lee, RYN ;
Sternberg, PW .
COMPARATIVE AND FUNCTIONAL GENOMICS, 2003, 4 (01) :121-126
[16]   Ontology-aware classification of tissue and cell-type signals in gene expression profiles across platforms and technologies [J].
Lee, Young-suk ;
Krishnan, Arjun ;
Zhu, Qian ;
Troyanskaya, Olga G. .
BIOINFORMATICS, 2013, 29 (23) :3036-3044
[17]   Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 [J].
Love, Michael I. ;
Huber, Wolfgang ;
Anders, Simon .
GENOME BIOLOGY, 2014, 15 (12)
[18]  
McKinney W., 2011, PYTHON HIGH PERFORMA, V14, P1
[19]   GREAT improves functional interpretation of cis-regulatory regions [J].
McLean, Cory Y. ;
Bristor, Dave ;
Hiller, Michael ;
Clarke, Shoa L. ;
Schaar, Bruce T. ;
Lowe, Craig B. ;
Wenger, Aaron M. ;
Bejerano, Gill .
NATURE BIOTECHNOLOGY, 2010, 28 (05) :495-U155
[20]   Behavioral avoidance of pathogenic bacteria by Caenorhabditis elegans [J].
Meisel, Joshua D. ;
Kim, Dennis H. .
TRENDS IN IMMUNOLOGY, 2014, 35 (10) :465-470