MULTISCALE POISSON PROCESS APPROACHES FOR DETECTING AND ESTIMATING DIFFERENCES FROM HIGH-THROUGHPUT SEQUENCING ASSAYS

Cited by: 0
Authors
Shim, Heejung [1]
Xing, Zhengrong [2]
Pantaleo, Ester [2]
Luca, Francesca [3, 4]
Pique-Regi, Roger [4, 5]
Stephens, Matthew [6]
Affiliations
[1] Univ Melbourne, Sch Math & Stat & Melbourne Integrat Genom, Melbourne, Australia
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[3] Wayne State Univ, Dept Obstet & Gynecol, Detroit, MI USA
[4] Wayne State Univ, Ctr Mol Med & Genet, Detroit, MI USA
[5] Wayne State Univ, Ctr Mol Med & Genet, Detroit, MI USA
[6] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
Keywords
Multiscale Poisson processes; wavelets; differential expression analysis; high-throughput sequencing assays; high-resolution; Bayesian inference; functional data; count data; RNA-seq; DNase-seq; ATAC-seq; chromatin accessibility; RNA-SEQ; EXPRESSION ANALYSIS; OPEN CHROMATIN; IN-VIVO; ASSOCIATION;
DOI
10.1214/23-AOAS1828
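Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes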
020208 ; 070103 ; 0714 ;
Abstract
Estimating and testing for differences in molecular phenotypes (e.g., gene expression, chromatin accessibility, transcription factor binding) across conditions is an important part of understanding the molecular basis of gene regulation. These phenotypes are commonly measured using high-throughput, high-resolution count data that reflect how the phenotypes vary along the genome. Multiple methods have been proposed to help exploit these high-resolution measurements for differential expression analysis. However, they ignore the count nature of the data, instead using normal distributions that work well only for data with large sample sizes or high counts. Here we develop count-based methods to address this problem. We model the data for each sample using an inhomogeneous Poisson process with spatially structured underlying intensity function and then, building on multiscale models for the Poisson process, estimate and test for differences in the underlying intensity function across samples (or groups of samples). Using both simulation and real ATAC-seq data, we show that our method outperforms previous normal-based methods, especially in situations with small sample sizes or low counts.
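The multiscale idea referred to in the abstract can be illustrated with a small sketch. It is not the authors' implementation; it only shows the standard recursive-dyadic decomposition of a count vector, which rests on a known property of Poisson counts: given the total in a parent interval, the count in its left half is binomial with success probability equal to the left half's share of the intensity. The function names `multiscale_decompose` and `multiscale_reconstruct` are hypothetical, and the input length is assumed to be a power of two.

```python
def multiscale_decompose(counts):
    """Split a count vector into its total plus (left, parent) pairs per scale.

    Hypothetical helper for illustration only. Scales are returned coarsest
    first; each pair records the count in the left half of an interval and
    the count in the whole interval.
    """
    n = len(counts)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    current = list(counts)
    while len(current) > 1:
        # Pair up neighbours: keep the left count and the parent (sum) count.
        pairs = [(current[i], current[i] + current[i + 1])
                 for i in range(0, len(current), 2)]
        levels.append(pairs)
        current = [parent for _, parent in pairs]
    levels.reverse()  # coarsest scale first
    return current[0], levels


def multiscale_reconstruct(total, levels):
    """Invert multiscale_decompose: rebuild the count vector."""
    current = [total]
    for pairs in levels:
        nxt = []
        for left, parent in pairs:
            nxt.extend([left, parent - left])
        current = nxt
    return current


total, levels = multiscale_decompose([3, 1, 0, 4])
print(total)   # 8
print(levels)  # [[(4, 8)], [(3, 4), (0, 4)]]
print(multiscale_reconstruct(total, levels))  # [3, 1, 0, 4]
```

Under this decomposition, a difference in the shape of the intensity between two groups of samples shows up as a difference in the left-half proportions at one or more scales, while a difference in overall level shows up in the totals.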
Pages: 1773-1788
Number of pages: 16
Related papers
50 in total
  • [12] Detecting circular RNA from high-throughput sequence data with de Bruijn graph
    Li, Xin
    Wu, Yufeng
    BMC GENOMICS, 2020, 21 (Suppl 1)
  • [13] A beginners guide to SNP calling from high-throughput DNA-sequencing data
    Altmann, Andre
    Weber, Peter
    Bader, Daniel
    Preuss, Michael
    Binder, Elisabeth B.
    Mueller-Myhsok, Bertram
    HUMAN GENETICS, 2012, 131 (10) : 1541 - 1554
  • [14] seekCRIT: Detecting and characterizing differentially expressed circular RNAs using high-throughput sequencing data
    Chaabane, Mohamed
    Andreeva, Kalina
    Hwang, Jae Yeon
    Kook, Tae Lim
    Park, Juw Won
    Cooper, Nigel G. F.
    PLOS COMPUTATIONAL BIOLOGY, 2020, 16 (10)
  • [15] New insights into the avian epigenome from high-throughput sequencing experiments
    Mersch, Marjorie
    David, Sarah-Anne
    Vitorino Carvalho, Anais
    Foissac, Sylvain
    Collin, Anne
    Pitel, Frederique
    Coustham, Vincent
    INRA PRODUCTIONS ANIMALES, 2018, 31 (04) : 325 - 335
  • [16] A high-throughput pipeline for DNA/RNA/small RNA purification from tissue samples for sequencing
    Xu, Jing
    Pandoh, Pawan K.
    Corbett, Richard D.
    Smailus, Duane
    Bowlby, Reanne
    Brooks, Denise
    McDonald, Helen
    Haile, Simon
    Chahal, Sundeep
    Bilobram, Steve
    Mungall, Karen L.
    Mungall, Andrew J.
    Coope, Robin
    Moore, Richard A.
    Zhao, Yongjun
    Jones, Steven J. M.
    Marra, Marco A.
    BIOTECHNIQUES, 2023, 75 (02) : 47 - 55
  • [17] Identifying small interfering RNA loci from high-throughput sequencing data
    Hardcastle, Thomas J.
    Kelly, Krystyna A.
    Baulcombe, David C.
    BIOINFORMATICS, 2012, 28 (04) : 457 - 463
  • [18] ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data
    Wang, Kai
    Li, Mingyao
    Hakonarson, Hakon
    NUCLEIC ACIDS RESEARCH, 2010, 38 (16) : e164
  • [19] Characterization of Six Ampeloviruses Infecting Pineapple in Reunion Island Using a Combination of High-Throughput Sequencing Approaches
    Masse, Delphine
    Candresse, Thierry
    Filloux, Denis
    Massart, Sebastien
    Cassam, Nathalie
    Hostachy, Bruno
    Marais, Armelle
    Fernandez, Emmanuel
    Roumagnac, Philippe
    Verdin, Eric
    Teycheney, Pierre-Yves
    Lett, Jean-Michel
    Lefeuvre, Pierre
    VIRUSES-BASEL, 2024, 16 (07)
  • [20] Old Trade, New Tricks: Insights into the Spontaneous Mutation Process from the Partnering of Classical Mutation Accumulation Experiments with High-Throughput Genomic Approaches
    Katju, Vaishali
    Bergthorsson, Ulfar
    GENOME BIOLOGY AND EVOLUTION, 2019, 11 (01) : 136 - 165