Contributions to High-Performance Big Data Computing

Cited by: 2
Authors
Fox, Geoffrey [1]
Qiu, Judy [1]
Crandall, David [1]
Von Laszewski, Gregor [1]
Beckstein, Oliver [2]
Paden, John [3]
Paraskevakos, Ioannis [4]
Jha, Shantenu [4]
Wang, Fusheng [5]
Marathe, Madhav [6, 7]
Vullikanti, Anil [6, 7]
Cheatham, Thomas [8]
Affiliations
[1] Indiana Univ, Bloomington, IN USA
[2] Arizona State Univ, Tempe, AZ 85287 USA
[3] Kansas Univ, Lawrence, KS USA
[4] Rutgers State Univ, New Brunswick, NJ USA
[5] SUNY Stony Brook, Stony Brook, NY 11794 USA
[6] Virginia Tech, Blacksburg, VA USA
[7] Univ Virginia, Charlottesville, VA 22903 USA
[8] Univ Utah, Salt Lake City, UT 84112 USA
Keywords
HPC; Big Data; Clouds; Graph Analytics; Polar Science; Pathology; Biomolecular simulations; Network Science; MIDAS; SPIDAL; IMAGE REGISTRATION; DATA ANALYTICS; SOFTWARE; SYSTEM; SPARK; RECONSTRUCTION; LOCALIZATION; ALGORITHMS; LIBRARY; HADOOP;
DOI
10.3233/APC190005
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Our project is at the interface of Big Data and HPC, that is, High-Performance Big Data computing, and this paper describes a collaboration among seven universities: Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia Tech, and Utah. It addresses the intersection of high-performance and Big Data computing, with several application areas or communities driving the requirements for software systems and algorithms. We describe the base architecture, including HPC-ABDS (the High-Performance Computing enhanced Apache Big Data Stack), and an application use case study identifying key features that determine software and algorithm requirements. We summarize middleware, including the Harp-DAAL collective communication layer, the Twister2 Big Data toolkit, and pilot jobs. We then present SPIDAL (the Scalable Parallel Interoperable Data Analytics Library) and our work for it in core machine learning, image processing, and the application communities: Network Science, Polar Science, Biomolecular Simulations, Pathology, and Spatial Systems. We describe basic algorithms and their integration in end-to-end use cases.
Pages: 34-81
Page count: 48