Generalized least squares;
Heteroscedasticity;
Large p small n;
Model selection;
Sparse regression;
Variance estimation;
VARIABLE SELECTION;
SHRINKAGE;
DOI:
10.1111/j.1541-0420.2011.01652.x
Chinese Library Classification number:
Q [Biological Sciences];
Discipline classification codes:
07 ;
0710 ;
09 ;
Abstract:
We consider the problem of high-dimensional regression under nonconstant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows nonconstant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis.
Affiliations:
Hong, David: Univ Penn, Wharton Sch, Dept Stat & Data Sci, Philadelphia, PA 19104 USA
Yang, Fan: Tsinghua Univ, Yau Math Sci Ctr, Beijing 100084, Peoples R China; Beijing Inst Math Sci & Applicat, Beijing 100084, Peoples R China
Fessler, Jeffrey A.: Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
Balzano, Laura: Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2023, 5(01): 222-250
Affiliations:
Gao, Xiaoli: Oakland Univ, Dept Math & Stat, Rochester, MI 48309 USA
Huang, Jian: Univ Iowa, Dept Biostat, Iowa City, IA 52242 USA; Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
Affiliations:
Lee, Eun Ryung: Sungkyunkwan Univ, Dept Stat, Seoul 03063, South Korea
Park, Seyoung: Sungkyunkwan Univ, Dept Stat, Seoul 03063, South Korea
Lee, Sang Kyu: Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48823 USA; NCI, Biostat Branch, Bethesda, MD 20892 USA
Hong, Hyokyoung G.: NCI, Biostat Branch, Bethesda, MD 20892 USA