A Modification of the Lasso Method by Using the Bahadur Representation for the Genome-Wide Association Study

Cited by: 0
Authors
Utkin, Lev V. [1]
Zhuk, Yulia A. [2]
Affiliations
[1] Peter Great St Petersburg Polytech Univ, St Petersburg, Russia
[2] ITMO Univ, St Petersburg, Russia
Source
INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS | 2018 / Vol. 42 / No. 02
Keywords
data analysis; feature selection; Lasso; Bahadur representation; genome-wide association study
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
This paper proposes a modification of the Lasso method, a powerful machine learning tool, for genome-wide association studies. From the machine learning point of view, the paper solves a feature selection problem in which the features are single nucleotide polymorphisms (SNPs), or DNA markers, whose association with a quantitative trait is established. The main idea underlying the modification is to take into account correlations between DNA markers and peculiarities of the phenotype values by using the Bahadur representation of joint probabilities of binary random variables. Interactions between DNA markers, known as epistasis, are also considered within the framework of the proposed modification. Numerical experiments with real datasets illustrate the proposed modification.
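As a rough illustration of the Bahadur representation mentioned in the abstract, the sketch below builds the joint distribution of two binary variables (for example, two markers coded 0/1) from their marginal probabilities and their correlation. For a pair, the representation reads P(x1, x2) = P_indep(x1, x2) * (1 + r12 * z1 * z2), where z_i = (x_i - p_i) / sqrt(p_i * (1 - p_i)). The function names and the numeric values are assumptions chosen for illustration; the abstract does not say how the authors combine this representation with the Lasso, so this is not their method.

```python
from itertools import product
from math import sqrt


def bahadur_pair_pmf(p1, p2, r12):
    """Joint pmf of two binary variables under the Bahadur representation.

    p1, p2: marginal probabilities P(X1 = 1) and P(X2 = 1).
    r12: correlation between X1 and X2.
    Returns a dict mapping (x1, x2) to P(X1 = x1, X2 = x2).
    """
    pmf = {}
    for x1, x2 in product((0, 1), repeat=2):
        # Probability of (x1, x2) if the two variables were independent.
        indep = (p1 if x1 else 1 - p1) * (p2 if x2 else 1 - p2)
        # Standardized values z_i = (x_i - p_i) / sqrt(p_i * (1 - p_i)).
        z1 = (x1 - p1) / sqrt(p1 * (1 - p1))
        z2 = (x2 - p2) / sqrt(p2 * (1 - p2))
        # Second-order correction that introduces the correlation.
        pmf[(x1, x2)] = indep * (1 + r12 * z1 * z2)
    return pmf


def correlation(pmf):
    """Pearson correlation implied by a joint pmf over {0, 1}^2."""
    p1 = pmf[(1, 0)] + pmf[(1, 1)]
    p2 = pmf[(0, 1)] + pmf[(1, 1)]
    cov = pmf[(1, 1)] - p1 * p2
    return cov / sqrt(p1 * (1 - p1) * p2 * (1 - p2))


# Hypothetical marginals and correlation, chosen only for illustration.
pmf = bahadur_pair_pmf(p1=0.3, p2=0.6, r12=0.2)
```

With only two variables the correction term is exact: the four probabilities sum to one, and the correlation recovered from the resulting table equals the `r12` that was supplied. With more variables, higher-order terms appear in the representation, and not every combination of marginals and correlations yields non-negative probabilities.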
Pages: 175-188
Page count: 14