A Comprehensive Cloud Architecture for Machine Learning-enabled Research

Cited by: 0
Authors
Stubbs, Joe [1]
Freeman, Nathan [1]
Indrakusuma, Dhanny [1]
Garcia, Christian [1]
Halbach, Francois [1]
Hammock, Cody [1]
Curbelo, Gilbert [1]
Jamthe, Anagha [1]
Packard, Mike [1]
Fields, Alex [1]
Affiliations
[1] Texas Advanced Computing Center, Austin, TX 78758, USA
Funding
National Science Foundation (USA)
Keywords
GPUs; Cloud Computing; Machine Learning
DOI
10.1145/3626203.3670525
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The success of machine learning (ML) algorithms, and deep learning in particular, is having a transformative impact on a wide range of research disciplines, from astronomy, materials science, and climate change to bioinformatics, computational health, and animal ecology. At the same time, these new techniques introduce computational modalities that create challenges for academic computing centers and resource providers that have historically focused on asynchronous, batch-computing paradigms. In particular, there is an emergent need for computing models that enable efficient use of specialized hardware such as graphics processing units (GPUs) in the presence of interactive workloads. In this paper, we present a comprehensive, cloud-based architecture composed of open-source software layers to better meet the needs of modern ML processes and workloads. This framework, deployed at the Texas Advanced Computing Center and in use by various research teams, provides different interfaces at varying levels of abstraction to support and simplify the tasks of users with different backgrounds and expertise, and to efficiently leverage limited GPU resources for these tasks. We present techniques and implementation details for overcoming challenges related to developing and maintaining such an infrastructure, which will be of interest to service providers and infrastructure developers alike.
Pages: 8