Identification of metagenes and their Interactions through Large-scale Analysis of Arabidopsis Gene Expression Data

Cited by: 9
Authors
Wilson, Tyler J. [1 ]
Lai, Liming [1 ]
Ban, Yuguang [1 ]
Ge, Steven X. [1 ]
Affiliations
[1] South Dakota State University, Department of Mathematics and Statistics, Brookings, SD 57007, USA
Source
BMC Genomics | 2012 / Vol. 13
Funding
US National Institutes of Health (NIH)
Keywords
BIOINFORMATICS; DISCOVERY; NETWORK; REVEALS; BIOLOGY; TOOLS; LISTS;
DOI
10.1186/1471-2164-13-237
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Many plant genes have been identified through whole-genome sequencing, deep transcriptome sequencing, and other methods, yet our knowledge of the function of many of these genes remains limited. The integration and analysis of large gene-expression datasets give researchers the ability to formulate hypotheses concerning the function of, and interactions between, different groups of correlated genes.
Results: We applied the non-negative matrix factorization (NMF) algorithm to the AtGenExpress dataset, which consists of 783 microarray samples (29 separate experimental series) from the model plant Arabidopsis thaliana. We identified 15 metagenes, which are groups of genes with correlated expression. The functional roles of these metagenes were established by identifying enriched gene ontology (GO) categories using gene set enrichment analysis (GSEA). The activity levels of these metagenes under various experimental conditions were also analyzed to associate metagenes with stimuli and conditions. A metagene correlation network, constructed from the results of the NMF analysis, revealed many new interactions between the metagenes. Comparison of these metagenes with an earlier large-scale clustering analysis indicates many statistically significant overlaps.
Conclusions: This study identifies a network of correlated metagenes composed of Arabidopsis genes acting in a highly correlated fashion across a broad spectrum of experimental stimuli, which may shed some light on the function of many of the un-annotated genes.
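The NMF step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the standard multiplicative-update rule for squared error, a tiny made-up expression matrix (`expr`, 6 genes by 6 samples) in place of AtGenExpress, and a rank of 2 instead of 15. The helper names (`nmf`, `frobenius_error`, `metagene_network`) and the use of Pearson correlation between metagene activity profiles to draw network edges are likewise assumptions made for illustration. The code is pure Python to stay dependency-free; on real data a library routine such as scikit-learn's `NMF` would be the more practical choice.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (genes x samples) into
    W (genes x k, gene loadings) and H (k x samples, metagene activity)
    using multiplicative updates that reduce the squared error ||V - WH||."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def metagene_network(h, threshold=0.5):
    """Hypothetical network rule: link metagenes i and j when the
    correlation of their activity profiles (rows of H) is strong."""
    edges = []
    for i in range(len(h)):
        for j in range(i + 1, len(h)):
            r = pearson(h[i], h[j])
            if abs(r) >= threshold:
                edges.append((i, j, r))
    return edges

# Made-up toy data: genes 0-2 are active in samples 0-2, genes 3-5 in samples 3-5.
expr = [
    [5.0, 4.0, 6.0, 0.2, 0.1, 0.3],
    [4.0, 5.0, 5.0, 0.1, 0.2, 0.1],
    [6.0, 5.0, 4.0, 0.3, 0.1, 0.2],
    [0.1, 0.2, 0.1, 3.0, 4.0, 5.0],
    [0.2, 0.1, 0.3, 4.0, 3.0, 4.0],
    [0.1, 0.3, 0.2, 5.0, 4.0, 3.0],
]

if __name__ == "__main__":
    w, h = nmf(expr, k=2)
    print("reconstruction error:", frobenius_error(expr, w, h))
    print("metagene edges:", metagene_network(h, threshold=0.5))
```

In this sketch each row of `H` plays the role of a metagene's activity across samples, and each column of `W` gives the gene loadings that would feed an enrichment analysis. The rank and the correlation threshold are free parameters here; the abstract does not state how the authors chose theirs.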
Pages: 14