HEXO: Offloading Long-Running Compute- and Memory-Intensive Workloads on Low-Cost, Low-Power Embedded Systems

Times cited: 0
Authors
Olivier, Pierre [1]
Mehrab, A. K. M. Fazla [2]
Errabelly, Sandeep [2]
Lankes, Stefan [4]
Karaoui, Mohamed Lamine [5]
Lyerly, Robert [2]
Kim, Sang-Hoon [6]
Barbalace, Antonio [7]
Ravindran, Binoy [3]
Affiliations
[1] Univ Manchester, Dept Comp Sci, Manchester M13 9PL, England
[2] Virginia Tech, Blacksburg, VA 24061 USA
[3] Virginia Tech, Syst Software Res Grp, Blacksburg, VA 24061 USA
[4] Rhein Westfal TH Aachen, Inst Automat Complex Power Syst, D-52062 Aachen, Germany
[5] Huawei, F-92100 Boulogne Billancourt, France
[6] Ajou Univ, Suwon 16499, South Korea
[7] Univ Edinburgh, Sch Informat, Edinburgh EH8 9YL, Scotland
Funding
Engineering and Physical Sciences Research Council (UK); National Science Foundation (USA)
Keywords
Servers; Embedded systems; Benchmark testing; Power demand; Costs; Memory management; Throughput; Random access memory; Linux; Virtual machine monitors; Heterogeneous ISAs; unikernels; migration; offloading; MODEL;
DOI
10.1109/TCC.2024.3482178
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
OS-capable embedded systems with very low power consumption are available at an extremely low price point, which makes them highly compelling in a datacenter context. We show that sharing long-running, compute-intensive datacenter workloads between a server machine and one or a few connected embedded boards of negligible cost and power consumption can yield significant performance and energy benefits. Our approach, named Heterogeneous EXecution Offloading (HEXO), selectively offloads Virtual Machines (VMs) from server-class machines to embedded boards. Our design tackles several challenges. We address the Instruction Set Architecture (ISA) difference between typical servers (x86) and embedded systems (ARM) through hypervisor and guest OS-level support for heterogeneous-ISA runtime VM migration. We cope with the limited resources of embedded systems by using lightweight VMs (unikernels) and by using the server's free RAM as remote memory for the embedded boards through Netswap, a transparent, lightweight memory disaggregation mechanism for heterogeneous server-embedded clusters. VMs are offloaded based on an estimate of the slowdown expected from running on a given board. We build a prototype of HEXO and demonstrate significant increases in throughput (up to 67%) and energy efficiency (up to 56%) using benchmarks representative of compute-intensive, long-running workloads.
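The abstract says that VMs are offloaded based on their estimated slowdown on a given board, but it gives no details of the policy. The following is only a minimal sketch of one plausible selection rule, not the authors' algorithm: the function name `pick_vm_to_offload`, the `max_slowdown` threshold, and the example slowdown figures are all hypothetical.

```python
from typing import Dict, Optional


def pick_vm_to_offload(
    estimated_slowdown: Dict[str, float],
    max_slowdown: float,
) -> Optional[str]:
    """Pick the VM expected to suffer least from running on the board.

    estimated_slowdown maps a (hypothetical) VM name to the estimated ratio
    of its run time on the embedded board to its run time on the server.
    max_slowdown is an assumed cut-off: VMs above it are never offloaded.
    Returns None when no VM is worth offloading.
    """
    # Keep only VMs whose estimated slowdown is within the assumed threshold.
    candidates = {
        vm: slowdown
        for vm, slowdown in estimated_slowdown.items()
        if slowdown <= max_slowdown
    }
    if not candidates:
        return None
    # Among the acceptable VMs, choose the one with the smallest slowdown.
    return min(candidates, key=lambda vm: candidates[vm])
```

With illustrative estimates such as `{"vm_a": 3.0, "vm_b": 1.5, "vm_c": 8.0}` and a threshold of `4.0`, this rule would select `vm_b`; if every estimate exceeded the threshold, all VMs would stay on the server.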
Pages: 1415-1432
Number of pages: 18