A comparison of internal validation techniques for multifactor dimensionality reduction

Cited by: 9
Authors
Winham, Stacey J. [1]
Slater, Andrew J. [2, 3]
Motsinger-Reif, Alison A. [1, 2]
Affiliations
[1] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[2] N Carolina State Univ, Bioinformat Res Ctr, Raleigh, NC 27695 USA
[3] N Carolina State Univ, Dept Genet, Raleigh, NC 27695 USA
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
GENE-GENE INTERACTIONS; MULTIPLE-SCLEROSIS; HUMAN-DISEASE; EPISTASIS; SUSCEPTIBILITY;
DOI
10.1186/1471-2105-11-394
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Background: It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to use internal validation procedures that provide prediction estimates, in order to prevent model over-fitting and reduce potential false-positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate a three-way split (3WS) of the data, in combination with a post-hoc pruning procedure, as an alternative to cross-validation for internal model validation, with the aim of reducing computation time without impairing performance. We compare the power to detect true disease-causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data.
Results: MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false-positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, when models that contain the true disease-causing loci together with false-positive loci are also counted, the power of cross-validation is equivalent to that of 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model.
Conclusions: Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. The specific pruning procedure should be chosen with an understanding of the trade-off between identifying all relevant genetic effects at the cost of including false positives, and missing important genetic factors. This implies that 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
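As a rough illustration of the two internal validation schemes the abstract compares, the sketch below partitions sample indices for k-fold cross-validation and for a single three-way split. It shows data partitioning only, not MDR itself. The function names, the 2:2:1 proportions, and the labels given to the three partitions are assumptions made for illustration; the abstract does not state them.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def three_way_split(n_samples, proportions=(2, 2, 1), seed=0):
    """Split indices once into three disjoint sets.

    The default proportions are a hypothetical choice, not a value
    taken from the abstract.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    total = sum(proportions)
    n_train = n_samples * proportions[0] // total
    n_test = n_samples * proportions[1] // total
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    validation = idx[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# 5-fold cross-validation produces five training partitions to search.
cv_splits = list(k_fold_splits(100, k=5))
# A three-way split produces a single training partition.
train, test, validation = three_way_split(100)

print(len(cv_splits))
print(len(train), len(test), len(validation))
```

Under k-fold cross-validation any model search is repeated once per fold, whereas a three-way split needs only one pass over a single training partition. That is one plausible reading of the reported speed-up, not a statement of how the authors implemented it.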
Pages: 16