Robust linear regression methods in association studies

Cited by: 44
Authors
Lourenco, V. M. [1]
Pires, A. M. [2, 3]
Kirst, M. [4]
Affiliations
[1] Univ Nova Lisboa, Fac Ciencias & Tecnol, Dept Math, P-2829516 Caparica, Portugal
[2] Univ Tecn Lisboa, Inst Super Tecn, Dept Math, P-1049001 Lisbon, Portugal
[3] Univ Tecn Lisboa, CEMAT, Inst Super Tecn, P-1049001 Lisbon, Portugal
[4] Univ Florida, Genet Inst, Plant Mol & Cellular Biol Program, Sch Forest Resources & Conservat, Gainesville, FL 32611 USA
Keywords
SINGLE-NUCLEOTIDE POLYMORPHISMS; MAYS SSP PARVIGLUMIS; LINKAGE DISEQUILIBRIUM; STRUCTURED POPULATIONS; QUANTITATIVE TRAITS; GENETIC-ASSOCIATION; CANDIDATE GENES; STRATIFICATION; STATISTICS; INFERENCE
DOI
10.1093/bioinformatics/btr006
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: It is well known that data deficiencies, such as coding/rounding errors, outliers or missing values, may lead to misleading results for many statistical methods. Robust statistical methods are designed to accommodate certain types of those deficiencies, allowing for reliable results under various conditions. We analyze the case of statistical tests to detect associations between individual genomic variations (SNPs) and quantitative traits when deviations from the normality assumption are observed. We consider the classical analysis of variance tests for the parameters of the appropriate linear model and a robust version of those tests based on M-regression. We then compare their empirical power and level using simulated data with several degrees of contamination.
Results: Data normality is nothing but a mathematical convenience. In practice, experiments usually yield data with non-conforming observations. In the presence of this type of data, classical least squares statistical methods perform poorly, giving biased estimates, raising the number of spurious associations and often failing to detect true ones. We show, through a simulation study and a real data example, that the robust methodology can be more powerful and thus more adequate for association studies than the classical approach.
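The contrast the abstract draws between least squares and M-regression can be illustrated with a small sketch. This is not the authors' implementation: the Huber weight function, the tuning constant 1.345, the MAD scale estimate, the additive 0/1/2 genotype coding, the simulated effect size, and the contamination scheme are all assumptions chosen for illustration. The sketch compares slope estimates only; it does not reproduce the paper's robust tests or its power and level study.

```python
import numpy as np


def ols_fit(X, y):
    """Classical least squares estimate of the regression coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def huber_m_fit(X, y, c=1.345, max_iter=100, tol=1e-8):
    """Huber M-estimate via iteratively reweighted least squares (IRLS).

    Assumed choices: Huber weights with tuning constant c and a
    MAD-based residual scale that is re-estimated at every iteration.
    """
    beta = ols_fit(X, y)  # start from the least squares solution
    for _ in range(max_iter):
        resid = y - X @ beta
        # Robust scale: normalised median absolute deviation of residuals.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale < 1e-12:  # (near-)perfect fit, nothing to downweight
            break
        u = np.abs(resid) / scale
        # Huber weights: 1 inside [-c, c], c/|u| outside.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        new_beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Hypothetical simulation: one SNP coded additively (0/1/2 minor alleles).
rng = np.random.default_rng(0)
n, true_effect = 200, 0.5
genotype = rng.binomial(2, 0.3, size=n)
trait = 1.0 + true_effect * genotype + rng.normal(size=n)

# Contaminate 10% of the sample: shift the trait of 20 individuals
# that carry no minor allele (an assumed, deliberately harmful scheme).
bad = np.flatnonzero(genotype == 0)[:20]
trait[bad] += 15.0

X = np.column_stack([np.ones(n), genotype])
beta_ols = ols_fit(X, trait)
beta_rob = huber_m_fit(X, trait)

print("true SNP effect:", true_effect)
print("OLS estimate   :", beta_ols[1])
print("Huber estimate :", beta_rob[1])
```

Under this contamination scheme the least squares slope is pulled far from the simulated effect, while the Huber estimate stays much closer to it, which is the qualitative behaviour the abstract reports for biased classical estimates.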
Pages: 815-821
Number of pages: 7
Related papers
50 in total
  • [31] Penalized Multimarker vs. Single-Marker Regression Methods for Genome-Wide Association Studies of Quantitative Traits
    Yi, Hui
    Breheny, Patrick
    Imam, Netsanet
    Liu, Yongmei
    Hoeschele, Ina
    GENETICS, 2015, 199 (01) : 205 - U334
  • [32] Association Mapping by Generalized Linear Regression With Density-Based Haplotype Clustering
    Igo, Robert P., Jr.
    Li, Jing
    Goddard, Katrina A. B.
    GENETIC EPIDEMIOLOGY, 2009, 33 (01) : 16 - 26
  • [33] Weighted functional linear regression models for gene-based association analysis
    Belonogova, Nadezhda M.
    Svishcheva, Gulnara R.
    Wilson, James F.
    Campbell, Harry
    Axenovich, Tatiana I.
    PLOS ONE, 2018, 13 (01):
  • [34] Robust relationship inference in genome-wide association studies
    Manichaikul, Ani
    Mychaleckyj, Josyf C.
    Rich, Stephen S.
    Daly, Kathy
    Sale, Michele
    Chen, Wei-Min
    BIOINFORMATICS, 2010, 26 (22) : 2867 - 2873
  • [35] On the Analysis of Genome-Wide Association Studies in Family-Based Designs: A Universal, Robust Analysis Approach and an Application to Four Genome-Wide Association Studies
    Won, Sungho
    Wilk, Jemma B.
    Mathias, Rasika A.
    O'Donnell, Christopher J.
    Silverman, Edwin K.
    Barnes, Kathleen
    O'Connor, George T.
    Weiss, Scott T.
    Lange, Christoph
    PLOS GENETICS, 2009, 5 (11)
  • [36] Multilocus association testing with penalized regression
    Basu, Saonli
    Pan, Wei
    Shen, Xiaotong
    Oetting, William S.
    GENETIC EPIDEMIOLOGY, 2011, 35 (08) : 755 - 765
  • [37] Methods and Tools for Bayesian Variable Selection and Model Averaging in Normal Linear Regression
    Forte, Anabel
    Garcia-Donato, Gonzalo
    Steel, Mark
    INTERNATIONAL STATISTICAL REVIEW, 2018, 86 (02) : 237 - 258
  • [38] Optimal use of regression models in genome-wide association studies
    Powell, J. E.
    Kranis, A.
    Floyd, J.
    Dekkers, J. C. M.
    Knott, S.
    Haley, C. S.
    ANIMAL GENETICS, 2012, 43 (02) : 133 - 143
  • [39] Transcriptome wide association studies: general framework and methods
    Xie, Yuhan
    Shan, Nayang
    Zhao, Hongyu
    Hou, Lin
    QUANTITATIVE BIOLOGY, 2021, 9 (02) : 141 - 150
  • [40] Linear regression: robust heteroscedastic confidence bands that have some specified simultaneous probability coverage
    Wilcox, Rand R.
    JOURNAL OF APPLIED STATISTICS, 2017, 44 (14) : 2564 - 2574