A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq

Cited by: 3
Authors
Ye, Meixia [1]
Wang, Zhong [1]
Wang, Yaqun [2]
Wu, Rongling [1, 3]
Affiliations
[1] Beijing Forestry Univ, Ctr Computat Biol, Beijing 100083, Peoples R China
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[3] Penn State Univ, University Pk, PA 16802 USA
Keywords
multivariate Poisson; RNA-seq; gene expression; gene cluster; mixture model; overdispersed count data; profiles; networks
DOI
10.1093/bib/bbu013
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Dynamic changes in gene expression reflect an intrinsic mechanism by which an organism responds to developmental and environmental signals. With the increasing availability of RNA-seq expression data across time and space, classifying genes by biological function from such data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete nature of the read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by a first-order autoregressive process. The model is fitted with the Expectation-Maximization (EM) algorithm, and model selection is used to determine the optimal number of gene clusters; the fit yields estimates of the Poisson parameters that describe the time-dependent expression pattern of the genes in each cluster. The model is demonstrated on a real data set from an experiment designed to link the pattern of gene expression to catkin development in white poplar, and its usefulness is validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating a global view of expression dynamics and an understanding of gene regulation mechanisms.
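The EM-plus-model-selection procedure summarized in the abstract can be illustrated with a deliberately simplified sketch. It assumes that, within a cluster, counts at different time points are independent Poisson variables, so it omits the multivariate Poisson covariance and the first-order autoregressive structure that the paper uses. BIC stands in for the model selection criterion, which the abstract does not name. The function names `fit_poisson_mixture` and `bic` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_poisson_mixture(counts, n_clusters, n_iter=200, tol=1e-8, seed=0):
    """EM for a mixture of independent Poisson time profiles (simplified).

    counts: (n_genes, n_times) array of non-negative read counts.
    Returns mixing weights, cluster mean profiles, responsibilities, and the
    log-likelihood from the last E-step, up to the term -sum(log(y!)), which
    does not depend on the clusters.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(counts, dtype=float)
    n_genes, _ = y.shape
    # Assumption: initialise each mean profile from a randomly chosen gene,
    # shifted by 0.5 so that no rate starts at exactly zero.
    lam = y[rng.choice(n_genes, size=n_clusters, replace=False)] + 0.5
    weights = np.full(n_clusters, 1.0 / n_clusters)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log(weight_k * prod_t Poisson(y_gt | lam_kt)) without log(y!).
        log_p = np.log(weights)[None, :] + y @ np.log(lam).T - lam.sum(axis=1)[None, :]
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        ll = log_norm.sum()
        # M-step: weighted proportions and weighted mean count per time point.
        nk = np.maximum(resp.sum(axis=0), 1e-12)  # guard against empty clusters
        weights = nk / nk.sum()
        lam = np.maximum((resp.T @ y) / nk[:, None], 1e-10)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weights, lam, resp, ll


def bic(log_lik, n_clusters, n_times, n_genes):
    """BIC with (K - 1) mixing weights and K * T Poisson rates as free parameters."""
    n_params = (n_clusters - 1) + n_clusters * n_times
    return -2.0 * log_lik + n_params * np.log(n_genes)
```

In use, one would fit the mixture for a range of cluster numbers and keep the one with the lowest BIC. Because the dropped `log(y!)` term is identical for every candidate, it does not affect that comparison.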
Pages: 205-215
Number of pages: 11