Pathway-based genetic association analysis for overdispersed count data

Cited by: 0
Authors
Liu, Yang [1]
Affiliations
[1] Wright State Univ, Dept Math & Stat, 3640 Colonel Glenn Hwy, Dayton, OH 45435 USA
Funding
U.S. National Institutes of Health (NIH)
Keywords
Overdispersion; association analysis; negative binomial regression; mixed effects; somatic mutations; differential expression analysis; rare-variant association; tests
DOI
10.1080/02664763.2025.2460073
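Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract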
Overdispersion is a common phenomenon in genetic data, such as gene expression count data. In genetic association studies, it is important to investigate the association between the expression of a gene and a set of genetic variants from a pathway. However, existing approaches for pathway analysis are primarily designed for continuous and binary outcomes and are not applicable to overdispersed count data. In this paper, we propose a hierarchical approach to analyze the association between an overdispersed count response and a set of low-frequency genetic variants in negative binomial regression. We derive score-type test statistics for both fixed and random effects of genetic variants, and further introduce a novel procedure for efficiently combining these two statistics for global testing. Through simulation studies, we demonstrate that the proposed method tends to be more powerful than existing methods under a wide range of scenarios. Additionally, we apply the proposed method to a colorectal cancer study, demonstrating its power in identifying associations between gene expression and somatic mutations.
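To make the idea of a score-type test for an overdispersed count response concrete, the sketch below tests a single fixed (burden) effect in a negative binomial regression. It is not the paper's statistic. It assumes the NB2 variance form Var(Y) = mu + alpha * mu^2, a log link with an intercept-only null model, a burden score equal to the per-sample count of variants in the pathway, and a method-of-moments dispersion estimate. The random-effect statistic and the combining procedure described in the abstract are not shown. The function name `nb_burden_score_test` and the toy data are hypothetical.

```python
import math


def nb_burden_score_test(y, burden):
    """Score test of H0: beta = 0 in log(mu_i) = b0 + beta * burden_i.

    Assumes NB2 counts with Var(Y_i) = mu_i + alpha * mu_i**2 (illustrative only).
    Returns (chi-square statistic with 1 df, p-value, dispersion estimate).
    """
    n = len(y)
    if n < 2 or n != len(burden):
        raise ValueError("y and burden must have the same length >= 2")

    # Under the intercept-only null, the NB2 MLE of the mean is the sample mean.
    mu = sum(y) / n
    if mu <= 0:
        raise ValueError("all counts are zero; the test is undefined")

    # Method-of-moments dispersion estimate (a simplification, not an MLE).
    s2 = sum((v - mu) ** 2 for v in y) / (n - 1)
    alpha = max((s2 - mu) / mu ** 2, 0.0)

    b_bar = sum(burden) / n
    centered_ss = sum((b - b_bar) ** 2 for b in burden)
    if centered_ss == 0:
        raise ValueError("burden is constant; the effect is not identifiable")

    denom = 1.0 + alpha * mu
    # Score for beta at the null, and its information adjusted for the intercept.
    score = sum((b - b_bar) * (v - mu) for b, v in zip(burden, y)) / denom
    info = (mu / denom) * centered_ss

    stat = score ** 2 / info
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, alpha


# Made-up toy data: expression counts and number of pathway variants per sample.
y = [3, 0, 12, 5, 1, 7, 30, 2, 4, 9, 0, 18]
burden = [0, 0, 1, 0, 0, 0, 2, 0, 0, 1, 0, 1]
stat, p_value, alpha = nb_burden_score_test(y, burden)
print(f"stat={stat:.3f}, p={p_value:.4f}, alpha={alpha:.3f}")
```

With alpha = 0 the statistic reduces to the Poisson score test. A positive alpha shrinks it, which is why ignoring overdispersion tends to inflate significance.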
Pages: 15