A bi-Poisson model for clustering gene expression profiles by RNA-seq

Cited by: 6
Authors
Wang, Ningtao [1]
Wang, Yaqun [1]
Hao, Han [1]
Wang, Luojun [1]
Wang, Zhong
Wang, Jianxin [2]
Wu, Rongling [1, 3, 4]
Affiliations
[1] Penn State Univ, Hershey, PA 17033 USA
[2] Beijing Forestry Univ, Beijing, Peoples R China
[3] Penn State Univ, Ctr Stat Genet, Hershey, PA 17033 USA
[4] Beijing Forestry Univ, Ctr Computat Biol, Beijing, Peoples R China
Keywords
RNA-seq; Poisson distribution; EM algorithm; breast cancer cell lines; differential expression; transcription factors; dynamics
DOI
10.1093/bib/bbt029
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation-maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene-environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.
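As an illustration of the kind of model the abstract describes, the following is a minimal sketch, not the authors' implementation. It fits, by EM, a mixture in which each gene's counts in the two environments are treated as independent Poisson draws with group-specific means. The function names, the plain single-stage EM (in place of the two-stage hierarchical algorithm mentioned above), the random initialization, and the use of BIC to compare candidate numbers of groups are all assumptions made for this sketch.

```python
import numpy as np
from math import lgamma


def _e_step(counts, log_fact, weights, means):
    """Return responsibilities and the log-likelihood at the given parameters."""
    # log of weight_k * Poisson(y1 | mean_k1) * Poisson(y2 | mean_k2) for every gene and group
    log_joint = (np.log(weights)[None, :]
                 + counts @ np.log(means).T
                 - means.sum(axis=1)[None, :]
                 - log_fact[:, None])
    # log-sum-exp over groups, done stably
    row_max = log_joint.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(log_joint - row_max).sum(axis=1, keepdims=True))
    return np.exp(log_joint - log_norm), float(log_norm.sum())


def fit_bipoisson_mixture(counts, n_groups, n_iter=200, tol=1e-8, seed=0):
    """EM for a mixture of two-environment Poisson profiles (hypothetical helper).

    counts: array of shape (n_genes, 2) with non-negative read counts.
    Returns (weights, means, responsibilities, log_likelihood).
    """
    counts = np.asarray(counts, dtype=float)
    n_genes = counts.shape[0]
    rng = np.random.default_rng(seed)
    # Assumed initialization: start each group at a randomly chosen gene,
    # shifted by 0.5 so that no mean starts at exactly zero.
    start = rng.choice(n_genes, size=n_groups, replace=False)
    means = counts[start] + 0.5
    weights = np.full(n_groups, 1.0 / n_groups)
    # sum of log(y!) per gene; constant in the parameters
    log_fact = np.vectorize(lgamma)(counts + 1.0).sum(axis=1)

    prev_ll = -np.inf
    for _ in range(n_iter):
        resp, ll = _e_step(counts, log_fact, weights, means)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # M-step: weighted group sizes and weighted mean counts per environment
        group_size = np.maximum(resp.sum(axis=0), 1e-12)
        weights = group_size / group_size.sum()
        means = np.maximum(resp.T @ counts / group_size[:, None], 1e-8)

    # Final E-step so the returned values match the returned parameters
    resp, ll = _e_step(counts, log_fact, weights, means)
    return weights, means, resp, ll


def bic(log_likelihood, n_groups, n_genes):
    """BIC with 2 means per group plus (n_groups - 1) free mixing weights."""
    n_params = 3 * n_groups - 1
    return -2.0 * log_likelihood + n_params * np.log(n_genes)
```

In this sketch, one would fit the model for several candidate values of n_groups and keep the fit with the smallest BIC. A group whose two fitted means differ markedly is a candidate for a plastic response, although the formal test described in the abstract is not reproduced here.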
Pages: 534-541
Number of pages: 8