PairGP: Gaussian process modeling of longitudinal data from paired multi-condition studies

Cited by: 2
Authors
Vantini, Michele [1 ]
Mannerstrom, Henrik [1 ]
Rautio, Sini [1 ]
Ahlfors, Helena [2 ]
Stockinger, Brigitta [2 ]
Lahdesmaki, Harri [1 ]
Affiliations
[1] Aalto Univ, Dept Comp Sci, Konemiehentie, Espoo 02150, Finland
[2] Francis Crick Inst, 1 Midland Rd, London NW1 1AT, England
Funding
Academy of Finland;
Keywords
Gaussian processes; Gene expressions; Time-series; Pairing effect; Differential condition analysis; GENE-EXPRESSION;
DOI
10.1016/j.compbiomed.2022.105268
Chinese Library Classification number
Q [Biological sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
High-throughput technologies produce gene expression time-series data that require fast, specialized algorithms to process. Current methods address aspects such as the non-stationarity of the process and temporal correlation, but they often fail to account for the pairing among replicates. We propose PairGP, a non-stationary Gaussian process method for comparing gene expression time-series across several conditions; it accounts for paired longitudinal study designs and identifies groups of conditions that have different gene expression dynamics. We demonstrate the method on both simulated data and previously unpublished RNA sequencing (RNA-seq) time-series with five conditions. The results show that modeling the pairing effect helps to better identify groups of conditions with different dynamics. The pairing effect model selects the most probable grouping of conditions well even when the number of conditions is high. The method is general and can be applied to any gene expression time-series dataset. The model identifies replicate effects shared among samples from the same biological replicate and models them as separate components. Learning the pairing effect as a separate component not only allows us to exclude it from the model to obtain better estimates of the condition effects, but also improves the precision of the model selection process. The pairing effect, previously treated as noise, is now identified as a separate component, resulting in more accurate and explanatory models of the data.
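The idea of treating the pairing effect as its own additive component, and of scoring candidate groupings of conditions, can be illustrated with a minimal sketch. This is not the PairGP implementation: it assumes a stationary RBF kernel as a stand-in for the non-stationary kernel the abstract mentions, fixed hyperparameters instead of optimized ones, and hypothetical names (`rbf`, `additive_cov`, `log_marginal_likelihood`) and toy data chosen only for illustration.

```python
import numpy as np

def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential kernel (a stationary stand-in, assumed for illustration)."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def additive_cov(t, group, replicate, noise=0.1, with_pairing=True):
    """Covariance = shared trend + group-specific effect (+ pairing effect) + noise.

    Hyperparameter values are arbitrary assumptions, not fitted values.
    """
    K = rbf(t, t, 1.0, 2.0)                                   # response shared by all samples
    same_group = (group[:, None] == group[None, :]).astype(float)
    K = K + same_group * rbf(t, t, 1.0, 2.0)                  # effect shared within a group
    if with_pairing:
        same_rep = (replicate[:, None] == replicate[None, :]).astype(float)
        K = K + same_rep * rbf(t, t, 0.5, 4.0)                # effect shared within a replicate
    return K + noise * np.eye(len(t))

def log_marginal_likelihood(y, K):
    """Gaussian log marginal likelihood of y under a zero-mean GP with covariance K."""
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

# Toy paired design: 3 conditions x 2 biological replicates x 4 time points.
times = np.array([0.0, 1.0, 2.0, 4.0])
cond = np.repeat(np.arange(3), 2 * len(times))
rep = np.tile(np.repeat(np.arange(2), len(times)), 3)
t = np.tile(times, 3 * 2)

# Simulate one gene where conditions 0 and 1 share dynamics and condition 2 differs.
true_grouping = np.array([0, 0, 1])
rng = np.random.default_rng(0)
y = rng.multivariate_normal(np.zeros(len(t)), additive_cov(t, true_grouping[cond], rep))

# Score a few candidate groupings, with and without the pairing component.
candidates = {"all same": [0, 0, 0], "{0,1} vs {2}": [0, 0, 1], "all different": [0, 1, 2]}
scores = {}
for name, grouping in candidates.items():
    g = np.array(grouping)[cond]
    for pairing in (True, False):
        scores[(name, pairing)] = log_marginal_likelihood(y, additive_cov(t, g, rep, with_pairing=pairing))
        print(f"{name:15s} pairing={pairing!s:5s} log marginal likelihood = {scores[(name, pairing)]:.2f}")
```

In this sketch the grouping with the highest log marginal likelihood would be selected; a fuller treatment would optimize the kernel hyperparameters for each candidate before comparing them.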
Pages: 7
Related papers
26 in total
  • [1] Methods for time series analysis of RNA-seq data with application to human Th17 cell differentiation
    Aijo, Tarmo
    Butty, Vincent
    Chen, Zhi
    Salo, Verna
    Tripathi, Subhash
    Burge, Christopher B.
    Lahesmaa, Riitta
    Lahdesmaki, Harri
    [J]. BIOINFORMATICS, 2014, 30 (12) : 113 - 120
  • [2] An integrative computational systems biology approach identifies differentially regulated dynamic transcriptome signatures which drive the initiation of human T helper cell differentiation
    Aijo, Tarmo
    Edelman, Sanna M.
    Lonnberg, Tapio
    Larjo, Antti
    Kallionpaa, Henna
    Tuomela, Soile
    Engstrom, Emilia
    Lahesmaa, Riitta
    Lahdesmaki, Harri
    [J]. BMC GENOMICS, 2012, 13
  • [3] Anders S., 2010, GENOME BIOL, V11, pR106, DOI 10.1186/gb-2010-11-10-r106
  • [4] Angelini C, 2007, STAT APPL GENET MOL, V6
  • [5] BATS: a Bayesian user-friendly software for Analyzing Time Series microarray experiments
    Angelini, Claudia
    Cutillo, Luisa
    De Canditiis, Daniela
    Mutarelli, Margherita
    Pensky, Marianna
    [J]. BMC BIOINFORMATICS, 2008, 9 (1)
  • [6] Fitting Linear Mixed-Effects Models Using lme4
    Bates, Douglas
    Maechler, Martin
    Bolker, Benjamin M.
    Walker, Steven C.
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2015, 67 (01): : 1 - 48
  • [7] A LIMITED MEMORY ALGORITHM FOR BOUND CONSTRAINED OPTIMIZATION
    BYRD, RH
    LU, PH
    NOCEDAL, J
    ZHU, CY
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1995, 16 (05) : 1190 - 1208
  • [8] An additive Gaussian process regression model for interpretable non-parametric analysis of longitudinal data
    Cheng, Lu
    Ramchandran, Siddharth
    Vatanen, Tommi
    Lietzen, Niina
    Lahesmaa, Riitta
    Vehtari, Aki
    Lahdesmaki, Harri
    [J]. NATURE COMMUNICATIONS, 2019, 10 (1)
  • [9] Genome-wide Profiling of Interleukin-4 and STAT6 Transcription Factor Regulation of Human Th2 Cell Programming
    Elo, Laura L.
    Jarvenpaa, Henna
    Tuomela, Soile
    Raghav, Sunil
    Ahlfors, Helena
    Laurila, Kirsti
    Gupta, Bhawna
    Lund, Riikka J.
    Tahvanainen, Johanna
    Hawkins, R. David
    Oresic, Matej
    Lahdesmaki, Harri
    Rasool, Omid
    Rao, Kanury V.
    Aittokallio, Tero
    Lahesmaa, Riitta
    [J]. IMMUNITY, 2010, 32 (06) : 852 - 862
  • [10] Clustering short time series gene expression data
    Ernst, J
    Nau, GJ
    Bar-Joseph, Z
    [J]. BIOINFORMATICS, 2005, 21 : I159 - I168