MiniMalloc: A Lightweight Memory Allocator for Hardware-Accelerated Machine Learning

Cited by: 1
Author
Moffitt, Michael D. [1]
Affiliation
[1] Google, Mountain View, CA 94043 USA
Source
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2023, VOL 4 | 2023
Keywords
memory allocation; hardware acceleration; machine learning; ARCHITECTURE; PACKING
DOI
10.1145/3623278.3624752
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present a new approach to static memory allocation, a key problem that arises in the compilation of machine learning models onto the resources of a specialized hardware accelerator. Our methodology involves a recursive depth-first search that limits exploration to a special class of canonical solutions, dramatically reducing the size of the search space. We also develop a spatial inference technique that exploits this special structure by pruning unpromising partial assignments and backtracking more effectively than otherwise possible. Finally, we introduce a new mechanism capable of detecting and eliminating dominated solutions from consideration. Empirical results demonstrate orders of magnitude improvement in performance as compared to the previous state-of-the-art on many benchmarks, as well as a substantial reduction in library size.
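To make the problem in the abstract concrete, here is a minimal Python sketch of static memory allocation solved by a recursive depth-first search with backtracking. It is not the MiniMalloc implementation and does not reproduce the paper's canonical-solution, spatial-inference, or dominance techniques. The names `Buffer`, `lowest_offset`, and `solve`, the half-open lifetime convention, and the rule that buffers are placed in nondecreasing offset order are all assumptions made for this illustration.

```python
from typing import NamedTuple


class Buffer(NamedTuple):
    start: int  # first time step at which the buffer is live (inclusive)
    end: int    # time step at which the buffer is freed (exclusive)
    size: int   # bytes required


def lifetimes_overlap(a, b):
    """True if the two half-open lifetimes [start, end) intersect."""
    return a.start < b.end and b.start < a.end


def lowest_offset(buf, placed):
    """Lowest offset where buf avoids every placed buffer it overlaps in time."""
    blockers = sorted(
        (off, off + other.size)
        for other, off in placed
        if lifetimes_overlap(buf, other)
    )
    offset = 0
    for lo, hi in blockers:
        if offset + buf.size <= lo:
            break  # buf fits in the gap below this blocker
        offset = max(offset, hi)
    return offset


def solve(buffers, capacity):
    """Return one offset per buffer, or None if no packing fits in capacity."""
    offsets = [None] * len(buffers)

    def search(remaining, placed, floor):
        if not remaining:
            return True
        candidates = []
        for i in remaining:
            off = lowest_offset(buffers[i], placed)
            if off + buffers[i].size > capacity:
                return False  # buffer i can no longer fit: backtrack early
            candidates.append((off, i))
        for off, i in sorted(candidates):
            if off < floor:
                continue  # assumption: only explore nondecreasing offsets
            offsets[i] = off
            if search(remaining - {i}, placed + [(buffers[i], off)], off):
                return True
        return False

    found = search(frozenset(range(len(buffers))), [], 0)
    return offsets if found else None


# Hypothetical workload: four 4-byte buffers with staggered lifetimes.
example = [Buffer(0, 3, 4), Buffer(1, 4, 4), Buffer(3, 6, 4), Buffer(4, 7, 4)]
print(solve(example, capacity=8))
```

The early `return False` is a deliberately simple form of pruning: once a buffer's lowest feasible offset exceeds the capacity, placing more buffers can only make matters worse, so the whole branch is abandoned. The pruning described in the abstract is considerably more sophisticated than this.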
Pages: 238-252
Page count: 15