A High-Performance Computing Implementation of Iterative Random Forest for the Creation of Predictive Expression Networks

Cited by: 24
Authors
Cliff, Ashley [1, 2]
Romero, Jonathon [1, 2]
Kainer, David [2]
Walker, Angelica [1, 2]
Furches, Anna [1, 2]
Jacobson, Daniel [1, 2]
Affiliations
[1] Univ Tennessee, Bredesen Ctr Interdisciplinary Res & Grad Educ, Knoxville, TN 37996 USA
[2] Oak Ridge Natl Lab, POB 2009, Oak Ridge, TN 37830 USA
Keywords
Random Forest; Iterative Random Forest; Gene Expression Networks; high-performance computing; X-AI-based eQTL;
DOI
10.3390/genes10120996
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification codes
071007; 090102;
Abstract
As time progresses and technology improves, biological data sets are continuously increasing in size. New methods and new implementations of existing methods are needed to keep pace with this increase. In this paper, we present a high-performance computing (HPC)-capable implementation of Iterative Random Forest (iRF). This new implementation enables the explainable-AI eQTL analysis of SNP sets with over a million SNPs. Using this implementation, we also present a new method, iRF Leave One Out Prediction (iRF-LOOP), for the creation of Predictive Expression Networks on the order of 40,000 genes or more. We compare the new implementation of iRF with the previous R version and analyze its time to completion on two of the world's fastest supercomputers, Summit and Titan. We also show iRF-LOOP's ability to capture biologically significant results when creating Predictive Expression Networks. This new implementation of iRF will enable the analysis of biological data sets at scales that were previously not possible.
Pages: 13