Heuristic Search for DNN Graph Substitutions

Cited by: 0
Authors
Deng, FeiFei [1 ]
Liu, HongKang [1 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan, Peoples R China
Source
2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023 | 2023
Keywords
graph substitutions; deep neural network; graph-level optimization; heuristic search;
DOI
10.1145/3590003.3590044
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The research and development of deep learning cannot be separated from deep neural networks (DNNs). In pursuit of higher accuracy, DNNs have become deeper and more complex, significantly increasing inference time and training cost. Existing deep learning frameworks optimize a DNN's runtime performance by transforming its computational graph according to hand-written rules, an approach that scales poorly when new operators are added to DNNs. TASO solves this maintainability problem by automatically generating graph substitutions; an optimized graph is explored by applying a sequence of such substitutions. However, TASO considers only the model's runtime performance during the search, which may miss potential optimizations. We propose HeuSO, a fine-grained computational graph optimizer with heuristics, to handle this problem. HeuSO extracts the types and numbers of operators in the computational graph and classifies them into four abstract types as high-level features, which facilitate the subsequent heuristic search and pruning algorithms. Guided by a heuristic function that integrates the model's cost with these high-level features, HeuSO generates a better sequence of graph substitutions and finds a better-optimized graph. To further reduce search time, HeuSO implements a pruning algorithm: using the high-level specifications, it quickly determines whether subgraphs of the original graph match the substitution rules. Evaluations on seven DNNs demonstrate that HeuSO outperforms state-of-the-art frameworks with 2.35x speedup while accelerating search time by up to 1.58x.
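The search strategy the abstract describes can be sketched as a best-first search whose score combines a graph's measured cost with its high-level operator features. This is a minimal illustration under assumptions, not the paper's actual implementation: the function names (`features`, `heuristic`, `search`), the choice of four abstract operator types, and the scoring weight are all hypothetical.

```python
import heapq

# Illustrative sketch (hypothetical names; not HeuSO's actual code) of a
# heuristic best-first search over graph substitutions. A computational graph
# is abstracted as its runtime cost plus counts of four abstract operator
# classes, standing in for the paper's high-level features.

ABSTRACT_TYPES = ("conv", "matmul", "elementwise", "reshape")  # assumed classes

def features(graph):
    """High-level features: operator counts per abstract type."""
    return tuple(graph["ops"].get(t, 0) for t in ABSTRACT_TYPES)

def heuristic(graph, alpha=0.1):
    """Score a candidate: measured cost plus a small feature term, so the
    search can prefer smaller graphs when costs are close."""
    return graph["cost"] + alpha * sum(features(graph))

def search(initial, substitutions, budget=100):
    """Best-first search over substitution sequences; each substitution maps
    a graph to a rewritten graph, or to None if its pattern does not match."""
    best = initial
    frontier = [(heuristic(initial), 0, initial)]
    tiebreak = 1  # keeps heap comparisons away from the graph dicts
    seen = set()
    while frontier and budget > 0:
        _, _, graph = heapq.heappop(frontier)
        budget -= 1
        if graph["cost"] < best["cost"]:
            best = graph
        for sub in substitutions:
            candidate = sub(graph)
            if candidate is None:
                continue  # cheap pruning: rule pattern not applicable
            key = (candidate["cost"], features(candidate))
            if key in seen:
                continue  # an equivalent state was already queued
            seen.add(key)
            heapq.heappush(frontier, (heuristic(candidate), tiebreak, candidate))
            tiebreak += 1
    return best

# Toy substitution rule: fuse two elementwise operators into one, saving cost.
def fuse_elementwise(graph):
    if graph["ops"].get("elementwise", 0) >= 2:
        ops = dict(graph["ops"])
        ops["elementwise"] -= 1
        return {"cost": graph["cost"] - 1.0, "ops": ops}
    return None

g0 = {"cost": 10.0, "ops": {"elementwise": 4, "matmul": 2}}
optimized = search(g0, [fuse_elementwise])  # repeatedly fuses elementwise ops
```

The operator-count check inside `fuse_elementwise` mirrors the abstract's pruning idea: a rule is rejected from the high-level features alone, before any subgraph matching or cost measurement would be needed.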
Pages: 236-241
Page count: 6
Related Papers
16 items in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] Chen TQ, 2018, PROCEEDINGS OF THE 13TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P579
  • [3] Cormen T. H., 2022, Introduction to Algorithms
  • [4] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
  • [5] Dong Shi, 2021, A survey on deep learning and its applications, COMPUT SCI REV, V40, DOI 10.1016/j.cosrev.2021.100379
  • [6] He KM, Zhang XY, Ren SQ, Sun J, 2016, Deep Residual Learning for Image Recognition, 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), P770
  • [7] Jia ZH, Padon O, Thomas J, Warszawski T, Zaharia M, Aiken A, 2019, TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions, PROCEEDINGS OF THE TWENTY-SEVENTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP '19), P47
  • [8] Jia Zhihao, 2019, P MACH LEARN SYST 20
  • [9] Iandola FN, 2016, Arxiv, DOI arXiv:1602.07360
  • [10] Niu W, Guan JX, Wang YZ, Agrawal G, Ren B, 2021, DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion, PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21), P883