A Novel Approach to Detecting Epistasis using Random Sampling Regularisation

Cited by: 5
Authors
Hind, Jade [1]
Lisboa, Paulo [2]
Hussain, Abir [3]
Al-Jumeily, Dhiya [3]
Affiliations
[1] Living Lens Ltd Co 49, Liverpool, Merseyside, England
[2] Liverpool John Moores Univ, Dept Appl Math, Liverpool L3 3AF, Merseyside, England
[3] Liverpool John Moores Univ, Dept Comp Sci, Liverpool L3 3AF, Merseyside, England
Keywords
Breast cancer; Genomics; Bioinformatics; GWAS study; SNPs; artificial intelligence; genome; logistic regression; BREAST-CANCER; PROTEIN INTERACTIONS; STATISTICAL-ANALYSIS; GENE-EXPRESSION; REPRESENTATION; ASSOCIATION; PREDICTION; RISK;
DOI
10.1109/TCBB.2019.2948330
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Epistasis analysis complements the 'common disease, common variant' hypothesis by highlighting the potential for connected networks of genetic variants to collaborate in producing a phenotypic expression. It is commonly performed either pairwise, testing variant against variant, or at unrestricted arity, testing higher-order interactions. Either way, it increases the number of tests beyond those performed in a standard approach such as a Genome-Wide Association Study (GWAS), in which the False Discovery Rate (FDR) is already an issue; multiplying the number of tests, up to a factorial rate, worsens the FDR problem. Epistasis analysis also introduces its own limitations of computational complexity and intensity, which depend on the analysis performed; in the most intensive case, a multivariate analysis has a time complexity of O(n!). This paper proposes a novel methodology for detecting epistasis that uses interpretable methods and best practice to outline interactions through filtering processes. The methodology uses Random Sampling Regularisation, which randomly splits the data into sample sets and applies a voting system across them to regularise the significance and reliability of the biological markers, single nucleotide polymorphisms (SNPs). Preliminary results are promising and outline a concise set of detected interactions. In the classification of breast cancer patients, epistasis detection indicated eight candidate risk interactions among five variants, and a single candidate variant with a high protective association.
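The random-split-and-vote idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function names, the subsample fraction, the number of rounds, the vote threshold, and the use of a 2x2 chi-square statistic as the per-pair test (the record's keywords suggest the paper relies on logistic regression) are all assumptions made for illustration.

```python
import itertools
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence
    return n * (a * d - b * c) ** 2 / denom


def pair_statistic(genotypes, labels, rows, i, j):
    """Association between carrying a minor allele at both SNPs i and j
    and case status, within the sampled rows (stand-in test, an assumption)."""
    a = b = c = d = 0
    for r in rows:
        both = genotypes[r][i] > 0 and genotypes[r][j] > 0
        case = labels[r] == 1
        if both and case:
            a += 1
        elif both:
            b += 1
        elif case:
            c += 1
        else:
            d += 1
    return chi2_2x2(a, b, c, d)


def random_sampling_votes(genotypes, labels, n_rounds=100, fraction=0.5,
                          threshold=3.84, min_vote_share=0.8, seed=0):
    """Return SNP pairs whose test passes in at least min_vote_share of
    random subsamples. All parameter defaults are hypothetical."""
    rng = random.Random(seed)
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    size = max(1, int(fraction * n_samples))
    votes = {pair: 0 for pair in itertools.combinations(range(n_snps), 2)}
    for _ in range(n_rounds):
        # Draw a random subsample of individuals without replacement.
        rows = rng.sample(range(n_samples), size)
        for i, j in votes:
            if pair_statistic(genotypes, labels, rows, i, j) > threshold:
                votes[(i, j)] += 1  # this subsample votes for the pair
    shares = {pair: v / n_rounds for pair, v in votes.items()}
    return {pair: s for pair, s in shares.items() if s >= min_vote_share}
```

Calling `random_sampling_votes(genotypes, labels)` with a genotype matrix coded as minor-allele counts (0, 1, 2) and binary case labels returns a dictionary mapping each retained SNP pair to its vote share. The sketch enumerates all n(n-1)/2 pairs, so it is only practical for a pre-filtered set of variants; the vote share acts as a stability filter on top of the per-subsample test rather than as a formal FDR correction.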
Pages: 1535-1545
Page count: 11