Computational performance of heterogeneous ensemble frameworks on high-performance computing platforms

Cited by: 0
Authors
Wang, Linhua [1]
Timsina, Prem [2]
Pandey, Gaurav [1]
Affiliations
[1] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[2] Mt Sinai Hlth Syst, New York, NY USA
Source
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2020
Keywords
Ensembles; predictive modeling; high-performance computing; Hadoop; computational performance; PROTEIN FUNCTION; SPARK; CLASSIFICATION
DOI
10.1109/BigData50022.2020.9378392
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To enable efficient computations on rapidly growing big data, a variety of high-performance computing (HPC) platforms, such as traditional multi-processor systems, Hadoop and cloud computing systems, have been developed. On the analytics side of big data, several innovative machine learning methods have been developed to enable the extraction of accurate and actionable knowledge from large datasets. In particular, heterogeneous ensemble algorithms, which are designed to aggregate an unrestricted variety and number of analytical models, have performed well for a variety of prediction problems. However, the performance of these algorithms in terms of computational metrics, such as time requirement, disk space consumption and memory usage, on these HPC platforms has not been systematically examined yet. Here, we address this gap in knowledge by implementing these algorithms and systematically assessing their computational performance on traditional HPC and Hadoop platforms. Our results show that these implementations used the resources, especially disk space and memory, consistent with the respective designs of the platforms. Furthermore, due to the iterative nature of the heterogeneous ensemble computations, the traditional HPC system executed them faster than Hadoop, since an in-memory design is better suited for them than a disk-based one. Overall, our study sheds new light on the computational performance of ensemble algorithms and software frameworks on two prominent HPC platforms, and offers a systematic methodology for conducting similar assessments for other data analytics methods as well. Basic source code of our heterogeneous ensemble implementations, as well as the HPC performance assessments, are available at https://github.com/GauravPandeyLab/HPC-Ensemble.
Pages: 2843-2850
Number of pages: 8