MULTISCALE POISSON PROCESS APPROACHES FOR DETECTING AND ESTIMATING DIFFERENCES FROM HIGH-THROUGHPUT SEQUENCING ASSAYS

Authors
Shim, Heejung [1]
Xing, Zhengrong [2]
Pantaleo, Ester [2]
Luca, Francesca [3, 4]
Pique-Regi, Roger [4, 5]
Stephens, Matthew [6]
Affiliations
[1] Univ Melbourne, Sch Math & Stat & Melbourne Integrat Genom, Melbourne, Australia
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[3] Wayne State Univ, Dept Obstet & Gynecol, Detroit, MI USA
[4] Wayne State Univ, Ctr Mol Med & Genet, Detroit, MI USA
[5] Wayne State Univ, Ctr Mol Med & Genet, Detroit, MI USA
[6] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
Keywords
Multiscale Poisson processes; wavelets; differential expression analysis; high-throughput sequencing assays; high-resolution; Bayesian inference; functional data; count data; RNA-seq; DNase-seq; ATAC-seq; chromatin accessibility; RNA-SEQ; EXPRESSION ANALYSIS; OPEN CHROMATIN; IN-VIVO; ASSOCIATION
DOI
10.1214/23-AOAS1828
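Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract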
Estimating and testing for differences in molecular phenotypes (e.g., gene expression, chromatin accessibility, transcription factor binding) across conditions is an important part of understanding the molecular basis of gene regulation. These phenotypes are commonly measured using high-throughput high-resolution count data that reflect how the phenotypes vary along the genome. Multiple methods have been proposed to help exploit these high-resolution measurements for differential expression analysis. However, they ignore the count nature of the data, instead using normal distributions that work well only for data with large sample sizes or high counts. Here we develop count-based methods to address this problem. We model the data for each sample using an inhomogeneous Poisson process with spatially structured underlying intensity function and then, building on multiscale models for the Poisson process, estimate and test for differences in the underlying intensity function across samples (or groups of samples). Using both simulation and real ATAC-seq data, we show that our method outperforms previous normal-based methods, especially in situations with small sample sizes or low counts.
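The abstract does not spell out the multiscale construction, so the following is only a minimal illustrative sketch in Python, not the authors' implementation. It shows the standard property that multiscale Poisson models rely on: for independent Poisson counts in two adjacent bins, the left-bin count given the pair's total is binomial, so a vector of binned counts can be re-expressed as one overall total plus (left count, parent total) pairs at each scale. The function names `multiscale_decompose` and `reconstruct` and the example counts are hypothetical.

```python
def multiscale_decompose(counts):
    """Re-express binned counts as a total plus per-scale (left, parent) pairs.

    Assumption for this sketch: the number of bins is a power of two.
    Levels are returned finest scale first.
    """
    n = len(counts)
    if n == 0 or n & (n - 1):
        raise ValueError("number of bins must be a power of two")
    levels = []
    current = list(counts)
    while len(current) > 1:
        left = current[0::2]                      # left child of each pair
        right = current[1::2]                     # right child of each pair
        parent = [a + b for a, b in zip(left, right)]
        levels.append((left, parent))
        current = parent                          # move to the next coarser scale
    return current[0], levels


def reconstruct(total, levels):
    """Invert multiscale_decompose: rebuild the original bin counts."""
    current = [total]
    for left, _ in reversed(levels):              # coarsest scale first
        finer = []
        for l, p in zip(left, current):
            finer.extend([l, p - l])              # right child = parent - left
        current = finer
    return current


if __name__ == "__main__":
    counts = [3, 0, 1, 4, 0, 0, 2, 2]             # made-up read counts in 8 bins
    total, levels = multiscale_decompose(counts)
    print(total)
    for left, parent in levels:
        print(left, parent)
    print(reconstruct(total, levels) == counts)
```

Because the decomposition is lossless, differences between samples in the underlying intensity can be examined scale by scale through the left-to-parent proportions; how the paper models and tests those proportions is not stated in the abstract.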
Pages: 1773-1788
Number of pages: 16