An Empirical Bayes risk prediction model using multiple traits for sequencing data

Cited by: 2
Authors
Li, Gengxin [1]
Cui, Yuehua [2]
Zhao, Hongyu [3]
Affiliations
[1] Wright State Univ, Dept Math & Stat, Dayton, OH 45435 USA
[2] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
[3] Yale Univ, Sch Publ Hlth, Dept Biostat, New Haven, CT 06520 USA
Funding
National Science Foundation (USA); European Research Council; National Institutes of Health (USA);
Keywords
area under the ROC curve (AUC); cross validation (CV); Empirical Bayes (EB) estimate; multiple traits; receiver operating characteristic curve (ROC); RARE VARIANTS; ASSOCIATION; DISEASE; REGRESSION; DIAGNOSIS; SELECTION;
DOI
10.1515/sagmb-2015-0060
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Rapidly developing sequencing technologies have improved disease risk prediction by identifying many novel genes. Many prediction methods have been proposed that use rich genomic information to predict binary disease outcomes. Intuitively, these methods can be improved further by making efficient use of the information in measured quantitative traits that are correlated with the binary outcomes. In this article, we propose a novel Empirical Bayes prediction model that uses information from both quantitative traits and binary disease status to improve risk prediction. Our method is built on a new statistic that better infers the gene effect on multiple traits, and it also enjoys good theoretical properties. We then consider sequencing data, combining information from multiple rare variants in individual genes to strengthen the signals of causal genetic effects. In simulation studies, we find that our proposed Empirical Bayes approach is superior to other existing methods in terms of feature selection and risk prediction. We further evaluate the effectiveness of our proposed method through its application to the sequencing data provided by the Genetic Analysis Workshop 18.
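As a rough illustration of two ideas the abstract names, the sketch below collapses the rare variants of one gene into a per-person burden score and evaluates a risk score by the area under the ROC curve (AUC). It is not the authors' Empirical Bayes statistic, which the abstract does not specify. The function names, the simple count-based collapsing rule, and the frequency threshold are all assumptions made for illustration only.

```python
from typing import List, Sequence


def burden_scores(genotypes: Sequence[Sequence[int]],
                  freq_threshold: float = 0.01) -> List[int]:
    """Collapse rare variants in one gene into a per-person burden score.

    genotypes: rows are individuals, columns are variants, entries are
    alternate-allele counts (0, 1 or 2). Hypothetical helper, not from the paper.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    # Allele frequency per variant: total allele count / (2 * individuals).
    freq = [sum(row[j] for row in genotypes) / (2 * n) for j in range(n_variants)]
    # Keep variants that are observed but at or below the rarity threshold.
    rare = [j for j in range(n_variants) if 0 < freq[j] <= freq_threshold]
    # Burden score: number of rare alleles each individual carries in the gene.
    return [sum(row[j] for j in rare) for row in genotypes]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

In a cross-validation setting, `auc` would be computed on held-out individuals only, so that the reported value reflects prediction rather than fit.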
Pages: 551-573
Page count: 23