Entropy-Based Information Gain Approaches to Detect and to Characterize Gene-Gene and Gene-Environment Interactions/Correlations of Complex Diseases

Cited by: 52
Authors
Fan, R. [1]
Zhong, M. [2]
Wang, S. [1, 3]
Zhang, Y. [1]
Andrew, A. [4]
Karagas, M. [4]
Chen, H. [5]
Amos, C. I. [6]
Xiong, M. [7]
Moore, J. H. [8]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Abbott Labs, Abbott Pk, IL 60064 USA
[3] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Peoples R China
[4] Dartmouth Med Sch, Dept Community & Family Med, Lebanon, NH USA
[5] NCI, Surveillance Res Program, Rockville, MD USA
[6] Univ Texas Houston, MD Anderson Canc Ctr, Dept Epidemiol, Houston, TX 77030 USA
[7] Univ Texas Houston, Ctr Human Genet, Houston, TX USA
[8] Dartmouth Med Sch, Dept Genet, Lebanon, NH USA
Keywords
gene-gene and gene-environment interactions; entropy; mutual information; interaction information; total correlation information; multifactor-dimensionality reduction; combinatorial approach; epistasis; smoking
DOI
10.1002/gepi.20621
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
For complex diseases, the relationship between genotypes, environmental factors, and phenotype is usually complex and nonlinear. Our understanding of the genetic architecture of diseases has increased considerably in recent years. However, both conceptually and methodologically, detecting gene-gene and gene-environment interactions remains a challenge, despite the existence of a number of efficient methods. One method that offers great promise but has not yet been widely applied to genomic data is the entropy-based approach of information theory. In this article, we first develop entropy-based test statistics to identify two-way and higher-order gene-gene and gene-environment interactions. We then apply these methods to a bladder cancer data set, thereby testing their power and identifying their strengths and weaknesses. For two-way interactions, we propose an information gain (IG) approach based on mutual information. For three-way and higher-order interactions, an interaction IG approach is used. In both cases, we develop one-dimensional test statistics to analyze sparse data. Compared to the naive chi-square test, the test statistics we develop have similar or higher power and are robust. Applying them to the bladder cancer data set allowed us to investigate the complex interactions between DNA repair gene single nucleotide polymorphisms, smoking status, and bladder cancer susceptibility. Although not yet widely applied, entropy-based approaches appear to be a useful tool for detecting gene-gene and gene-environment interactions. The test statistics we develop add to a growing body of methodologies that will gradually shed light on the complex architecture of common diseases. Genet. Epidemiol. 35:706-721, 2011. © 2011 Wiley Periodicals, Inc.
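As a rough illustration of the quantities the abstract names, the sketch below computes plug-in (empirical frequency) estimates of mutual information and of a three-way interaction information gain, using the standard textbook definitions I(X;Y) = H(X) + H(Y) - H(X,Y) and IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C). It is not the paper's test statistics, which the abstract does not spell out. The function names and the toy vectors `snp_a`, `snp_b`, and `status` are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Plug-in Shannon entropy (in bits) of the joint distribution of the columns."""
    joint = list(zip(*columns))          # one tuple per subject
    n = len(joint)
    counts = Counter(joint)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from observed frequencies."""
    return entropy(x) + entropy(y) - entropy(x, y)


def interaction_information_gain(a, b, c):
    """IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C).

    Positive values suggest that A and B jointly carry information about C
    beyond what each carries alone; negative values suggest redundancy.
    """
    ab = list(zip(a, b))                 # treat the pair (A, B) as one variable
    return (mutual_information(ab, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# Hypothetical toy data: status is the XOR of two binary factors, so neither
# factor alone is informative but the pair determines status completely.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

two_way = mutual_information(snp_a, status)                      # 0 bits
three_way = interaction_information_gain(snp_a, snp_b, status)   # 1 bit
```

The XOR pattern is a deliberately extreme case of a purely epistatic effect: marginal mutual information is zero for each factor, while the interaction term recovers the full one bit of information about status. With real, sparse genotype tables the plug-in estimates are biased, which is one reason dedicated test statistics are needed.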
Pages: 706-721
Page count: 16