High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

Cited by: 36
Authors
Daye, Z. John [1 ]
Chen, Jinbo [1 ]
Li, Hongzhe [1 ]
Affiliations
[1] Univ Penn, Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
Keywords
Generalized least squares; Heteroscedasticity; Large p small n; Model selection; Sparse regression; Variance estimation; Variable selection; Shrinkage
DOI
10.1111/j.1541-0420.2011.01652.x
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
We consider the problem of high-dimensional regression under nonconstant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows nonconstant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis.
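The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of penalizing both the mean and the variance components. It assumes a log-linear variance model, log sigma_i^2 = x_i' alpha, and swaps in a simple alternating scheme (a weighted lasso for the mean, then a lasso on log squared residuals for the variance) in place of the authors' doubly regularized procedure. The names `weighted_lasso`, `fit_hetero`, `lam_mean`, and `lam_var`, along with the simulated data, are hypothetical and chosen for illustration.

```python
import math
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def dot(row, coef):
    return sum(x * c for x, c in zip(row, coef))

def weighted_lasso(X, y, w, lam, n_iter=50):
    """Minimize (1/2n) * sum_i w_i (y_i - x_i'b)^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residuals y - Xb; b starts at zero
    den = [sum(w[i] * X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if den[j] == 0.0:
                continue
            # Weighted correlation of column j with the partial residual.
            num = sum(w[i] * X[i][j] * (r[i] + X[i][j] * b[j]) for i in range(n)) / n
            new = soft_threshold(num, lam) / den[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                b[j] = new
    return b

def fit_hetero(X, y, lam_mean, lam_var, n_outer=5):
    """Alternate between a sparse mean fit and a sparse log-variance fit.

    Illustrative surrogate only, not the estimator from the paper.
    """
    n = len(X)
    w = [1.0] * n  # first pass is an ordinary (homoscedastic) lasso
    for _ in range(n_outer):
        beta = weighted_lasso(X, y, w, lam_mean)
        # Variance step: lasso on centered log squared residuals.
        z = [math.log((y[i] - dot(X[i], beta)) ** 2 + 1e-8) for i in range(n)]
        z_bar = sum(z) / n
        alpha = weighted_lasso(X, [zi - z_bar for zi in z], [1.0] * n, lam_var)
        # Mean-step weights: inverse fitted variances, rescaled to average 1.
        w = [math.exp(-dot(X[i], alpha)) for i in range(n)]
        total = sum(w)
        w = [wi * n / total for wi in w]
    return beta, alpha

# Simulated example (assumed setup): predictor 1 drives the error variance only.
random.seed(0)
n, p = 300, 5
beta_true = [2.0, 0.0, 0.0, -1.5, 0.0]
alpha_true = [0.0, 1.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [dot(x, beta_true) + math.exp(dot(x, alpha_true) / 2.0) * random.gauss(0.0, 1.0)
     for x in X]
beta_hat, alpha_hat = fit_hetero(X, y, lam_mean=0.05, lam_var=0.1)
print("mean coefficients:    ", [round(b, 2) for b in beta_hat])
print("variance coefficients:", [round(a, 2) for a in alpha_hat])
```

In this sketch the two tuning parameters are fixed by hand; in practice they would be chosen by cross-validation or an information criterion, and the paper should be consulted for the actual objective and algorithm.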
Pages: 316 - 326
Number of pages: 11
Related papers
50 in total
  • [21] A systematic review on model selection in high-dimensional regression
    Lee, Eun Ryung
    Cho, Jinwoo
    Yu, Kyusang
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2019, 48 (01) : 1 - 12
  • [22] High-Dimensional Expected Shortfall Regression
    Zhang, Shushu
    He, Xuming
    Tan, Kean Ming
    Zhou, Wen-Xin
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2025,
  • [23] Penalized weighted smoothed quantile regression for high-dimensional longitudinal data
    Song, Yanan
    Han, Haohui
    Fu, Liya
    Wang, Ting
    STATISTICS IN MEDICINE, 2024, 43 (10) : 2007 - 2042
  • [24] On constrained and regularized high-dimensional regression
    Shen, Xiaotong
    Pan, Wei
    Zhu, Yunzhang
    Zhou, Hui
    ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2013, 65 (05) : 807 - 832
  • [25] High-Dimensional Constrained Huber Regression
    Wei, Quan
    Zhao, Ziping
    2024 IEEE 13TH SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP, SAM 2024, 2024
  • [26] Robust high-dimensional regression for data with anomalous responses
    Ren, Mingyang
    Zhang, Sanguo
    Zhang, Qingzhao
    ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2021, 73 (04) : 703 - 736
  • [27] Quantile forward regression for high-dimensional survival data
    Lee, Eun Ryung
    Park, Seyoung
    Lee, Sang Kyu
    Hong, Hyokyoung G.
    LIFETIME DATA ANALYSIS, 2023, 29 : 769 - 806
  • [28] Adaptive Bayesian density regression for high-dimensional data
    Shen, Weining
    Ghosal, Subhashis
    BERNOULLI, 2016, 22 (01) : 396 - 420
  • [29] Variable selection via combined penalization for high-dimensional data analysis
    Wang, Xiaoming
    Park, Taesung
    Carriere, K. C.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (10) : 2230 - 2243
  • [30] Improved two-stage model averaging for high-dimensional linear regression, with application to Riboflavin data analysis
    Pan, Juming
    BMC BIOINFORMATICS, 2021, 22 (01)