Argo NodeOS: Toward Unified Resource Management for Exascale

Cited by: 7
Authors
Perarnau, Swann [1]
Zounmevo, Judicael A. [1]
Dreher, Matthieu [1]
Van Essen, Brian C. [3]
Gioiosa, Roberto [2]
Iskra, Kamil [1]
Gokhale, Maya B. [3]
Yoshii, Kazutomo [1]
Beckman, Pete [1]
Affiliations
[1] Argonne Natl Lab, Argonne, IL 60439 USA
[2] Pacific Northwest Natl Lab, Richland, WA 99352 USA
[3] Lawrence Livermore Natl Lab, Livermore, CA USA
Source
2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS) | 2017
Funding
National Science Foundation (USA)
DOI
10.1109/IPDPS.2017.25
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Exascale systems are expected to feature hundreds of thousands of compute nodes with hundreds of hardware threads and complex memory hierarchies with a mix of on-package and persistent memory modules. In this context, the Argo project is developing a new operating system for exascale machines. Targeting production workloads using workflows or coupled codes, we improve the Linux kernel on several fronts. We extend the memory management of Linux to be able to subdivide NUMA memory nodes, allowing better resource partitioning among processes running on the same node. We also add support for memory-mapped access to node-local, PCIe-attached NVRAM devices and introduce a new scheduling class targeted at parallel runtimes supporting user-level load balancing. These features are unified into compute containers, a containerization approach focused on providing modern HPC applications with dynamic control over a wide range of kernel interfaces. To keep our approach compatible with industrial containerization products, we also identify contention points for the adoption of containers in HPC settings. Each NodeOS feature is evaluated by using a set of parallel benchmarks, miniapps, and coupled applications consisting of simulation and data analysis components, running on a modern NUMA platform. We observe out-of-the-box performance improvements easily matching, and often exceeding, those observed with expert-optimized configurations on standard OS kernels. Our lightweight approach to resource management retains the many benefits of a full OS kernel that application programmers have learned to depend on, at the same time providing a set of extensions that can be freely mixed and matched to best benefit particular application components.
Pages: 153-162
Number of pages: 10