MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Cited by: 1
Authors
Shen, Guan [1]
Zhao, Jieru [1]
Wang, Zeke [2]
Lin, Zhe [3]
Ding, Wenchao [4]
Wu, Chentao [1]
Chen, Quan [1]
Guo, Minyi [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Sun Yat Sen Univ, Guangzhou, Peoples R China
[4] Fudan Univ, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/DAC56929.2023.10247992
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly. As a promising solution achieving high scalability and low manufacturing cost, multi-accelerator systems widely exist in data centers, cloud platforms, and SoCs. Thus, a challenging problem arises in multi-accelerator systems: selecting a proper combination of accelerators from available designs and searching for efficient DNN mapping strategies. To this end, we propose MARS, a novel mapping framework that can perform computation-aware accelerator selection, and apply communication-aware sharding strategies to maximize parallelism. Experimental results show that MARS can achieve 32.2% latency reduction on average for typical DNN workloads compared to the baseline, and 59.4% latency reduction on heterogeneous models compared to the corresponding state-of-the-art method.
Pages: 6