Genome-wide association studies using binned genotypes

Cited by: 12
Authors
An, Bingxing [1 ]
Gao, Xue [1 ]
Chang, Tianpeng [1 ]
Xia, Jiangwei [2 ]
Wang, Xiaoqiao [1 ]
Miao, Jian [1 ]
Xu, Lingyang [1 ]
Zhang, Lupei [1 ]
Chen, Yan [1 ]
Li, Junya [1 ]
Xu, Shizhong [3 ]
Gao, Huijiang [1 ]
Affiliations
[1] Chinese Acad Agr Sci, Inst Anim Sci, Beijing, Peoples R China
[2] Westlake Inst Adv Study, Inst Basic Med Sci, Hangzhou, Peoples R China
[3] Univ Calif Riverside, Dept Bot & Plant Sci, Riverside, CA 92521 USA
Keywords
VARIABLE SELECTION; RIDGE-REGRESSION; MODEL; REGULARIZATION; SHRINKAGE; FRAMEWORK; TRAITS;
DOI
10.1038/s41437-019-0279-y
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
Linear mixed models (LMMs) that test trait association one marker at a time have been the most popular methods for genome-wide association studies. However, this approach has potential pitfalls: overconservativeness after Bonferroni correction, disregard of linkage disequilibrium (LD) between neighboring markers, and power reduction due to overfitting SNP effects. Multiple-locus models that simultaneously estimate and test all markers in the genome are therefore more appropriate. Building on multiple-locus models, we proposed a bin model that combines markers into bins based on their LD relationships. Each bin is treated as a new synthetic marker, and associations are tested between bins and traits. Because the number of bins can be substantially smaller than the number of markers, all bins can be fitted in a single model with a penalized multiple regression method. We developed an innovative method to bin neighboring markers and used the least absolute shrinkage and selection operator (LASSO) method. We compared BIN-Lasso with SNP-Lasso and Q + K-LMM in a simulation experiment and showed that the new method is more powerful, with a lower Type I error, than the other two methods. We also applied the bin model to a Chinese Simmental beef cattle population in an association study of bone weight. The new method identified more significant associations than the classical LMM. The bin model is a new dimension reduction technique that takes advantage of biological information (i.e., LD). The new method will be a significant breakthrough in associative genomics in the big data era.
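The abstract describes binning only in outline, so the following is a minimal sketch of the general idea rather than the authors' procedure. It assumes a simple greedy rule (a marker joins the current bin when its squared correlation with the preceding marker reaches a threshold) and assumes that a bin's synthetic genotype is the per-individual mean of its member markers. The function names `r_squared`, `bin_markers`, and `bin_genotypes`, the threshold of 0.8, and the toy genotypes are all hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 codes)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Monomorphic marker: correlation is undefined, treat LD as zero.
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def bin_markers(markers, threshold=0.8):
    """Greedily group neighbouring markers into bins (assumed rule, not the paper's).

    markers[j][i] is the genotype of individual i at marker j, in map order.
    Returns a list of bins, each a list of marker indices.
    """
    if not markers:
        return []
    bins = [[0]]
    for j in range(1, len(markers)):
        if r_squared(markers[j - 1], markers[j]) >= threshold:
            bins[-1].append(j)   # high LD with the previous marker: same bin
        else:
            bins.append([j])     # LD drops: start a new bin
    return bins


def bin_genotypes(markers, bins):
    """Build one synthetic marker per bin as the mean genotype of its members."""
    n = len(markers[0])
    return [[sum(markers[j][i] for j in b) / len(b) for i in range(n)]
            for b in bins]


# Toy data: 5 markers scored on 6 individuals (made up for illustration).
markers = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 2, 0],
    [2, 0, 1, 1, 2, 0],
    [0, 0, 2, 2, 1, 1],
]
bins = bin_markers(markers)
synthetic = bin_genotypes(markers, bins)
print(bins)            # marker indices grouped by bin
print(len(synthetic))  # number of synthetic markers passed to the regression
```

In this sketch the five toy markers collapse into three bins. The resulting synthetic markers would then be fitted jointly in a penalized regression such as LASSO (for example with the coordinate-descent software of reference [4]); that fitting step is not shown here.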
Pages: 288-298
Page count: 11
Related papers
36 in total
[1]   A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals [J].
Browning, Brian L. ;
Browning, Sharon R. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2009, 84 (02) :210-223
[2]   Searching new signals for production traits through gene-based association analysis in three Italian cattle breeds [J].
Capomaccio, Stefano ;
Milanesi, Marco ;
Bomba, Lorenzo ;
Cappelli, Katia ;
Nicolazzi, Ezequiel L. ;
Williams, John L. ;
Ajmone-Marsan, Paolo ;
Stefanon, Bruno .
ANIMAL GENETICS, 2015, 46 (04) :361-370
[3]   Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding [J].
de los Campos, Gustavo ;
Hickey, John M. ;
Pong-Wong, Ricardo ;
Daetwyler, Hans D. ;
Calus, Mario P. L. .
GENETICS, 2013, 193 (02) :327-+
[4]   Regularization Paths for Generalized Linear Models via Coordinate Descent [J].
Friedman, Jerome ;
Hastie, Trevor ;
Tibshirani, Rob .
JOURNAL OF STATISTICAL SOFTWARE, 2010, 33 (01) :1-22
[5]   A Data-Adaptive Sum Test for Disease Association with Multiple Common or Rare Variants [J].
Han, Fang ;
Pan, Wei .
HUMAN HEREDITY, 2010, 70 (01) :42-54
[6]   Model uncertainty and variable selection in Bayesian lasso regression [J].
Hans, Chris .
STATISTICS AND COMPUTING, 2010, 20 (02) :221-229
[7]   Hayes B, 2001, GENET SEL EVOL, V33, P209, DOI 10.1051/gse:2001117
[8]   RIDGE REGRESSION - BIASED ESTIMATION FOR NONORTHOGONAL PROBLEMS [J].
HOERL, AE ;
KENNARD, RW .
TECHNOMETRICS, 1970, 12 (01) :55-&
[9]   An Infinitesimal Model for Quantitative Trait Genomic Value Prediction [J].
Hu, Zhiqiu ;
Wang, Zhiquan ;
Xu, Shizhong .
PLOS ONE, 2012, 7 (07)
[10]   Variance component model to account for sample structure in genome-wide association studies [J].
Kang, Hyun Min ;
Sul, Jae Hoon ;
Service, Susan K. ;
Zaitlen, Noah A. ;
Kong, Sit-yee ;
Freimer, Nelson B. ;
Sabatti, Chiara ;
Eskin, Eleazar .
NATURE GENETICS, 2010, 42 (04) :348-U110