Pathway-based genetic association analysis for overdispersed count data

Cited by: 0

Authors:

Liu, Yang [1]

Affiliations:

[1] Wright State Univ, Dept Math & Stat, 3640 Colonel Glenn Hwy, Dayton, OH 45435 USA

Source:

JOURNAL OF APPLIED STATISTICS | 2025

Funding:

National Institutes of Health (USA);

Keywords:

Overdispersion; association analysis; negative binomial regression; mixed effects; somatic mutations; DIFFERENTIAL EXPRESSION ANALYSIS; RARE-VARIANT ASSOCIATION; TESTS;

DOI:

10.1080/02664763.2025.2460073

Chinese Library Classification (CLC) number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Overdispersion is a common phenomenon in genetic data, such as gene expression count data. In genetic association studies, it is important to investigate the association between a gene expression and a set of genetic variants from a pathway. However, existing approaches for pathway analysis are primarily designed for continuous and binary outcomes and are not applicable to overdispersed count data. In this paper, we propose a hierarchical approach to analyze the association between an overdispersed count response and a set of low-frequency genetic variants in negative binomial regression. We derive score-type test statistics for both fixed and random effects of genetic variants, and further introduce a novel procedure for efficiently combining these two statistics for global testing. Through simulation studies, we demonstrate that the proposed method tends to be more powerful than existing methods under a wide range of scenarios. Additionally, we apply the proposed method to a colorectal cancer study, demonstrating its power in identifying associations between gene expression and somatic mutations.
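To make the notion of overdispersion in the abstract concrete, the following is a minimal sketch and not the paper's method. It compares the sample variance of a count vector with its mean, and evaluates the standard negative binomial variance formula mu + mu^2/size. The function names and the count values are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Values near 1 are consistent with a Poisson model;
    values well above 1 suggest overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is 0")
    return variance(counts) / m


def nb_variance(mu, size):
    """Variance of a negative binomial with mean mu and size
    (inverse dispersion) parameter: mu + mu**2 / size."""
    return mu + mu ** 2 / size


# Hypothetical expression counts for one gene across eight samples
# (made-up numbers, not data from the study).
counts = [0, 3, 1, 12, 0, 7, 25, 2]

print("dispersion index:", round(dispersion_index(counts), 2))
print("NB variance at mu=10, size=2:", nb_variance(10, 2))
```

A dispersion index far above 1, as in this made-up vector, is the situation in which a Poisson model understates variability and a negative binomial model is a common alternative.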


Pages: 15
