Heuristic Search for DNN Graph Substitutions

Cited by: 0
Authors
Deng, FeiFei [1 ]
Liu, HongKang [1 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan, Peoples R China
Source
2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023 | 2023
Keywords
graph substitutions; deep neural network; graph-level optimization; heuristic search;
DOI
10.1145/3590003.3590044
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The research and development of deep learning cannot be separated from deep neural networks (DNNs). In pursuit of higher accuracy, DNNs have become deeper and more complex, significantly increasing inference time and training cost. Existing deep learning frameworks optimize a DNN's runtime performance by transforming its computational graph according to hand-written rules, an approach that scales poorly when new operators are added to DNNs. TASO solves this maintainability problem by automatically generating graph substitutions; an optimized graph is explored by applying a sequence of such substitutions. However, TASO considers only the model's runtime performance during the search, which may miss potential optimizations. We propose HeuSO, a fine-grained computational graph optimizer with heuristics, to handle this problem. HeuSO extracts the types and numbers of operators in the computational graph and classifies them into four abstract types as high-level features, which facilitate the subsequent heuristic search and pruning algorithms. Guided by a heuristic function that integrates the model's cost with these high-level features, HeuSO generates a better sequence of graph substitutions and finds a better-optimized graph. To further reduce search time, HeuSO implements a pruning algorithm: using the high-level specifications, it quickly determines whether subgraphs of the original graph match the substitution rules. Evaluations on seven DNNs demonstrate that HeuSO outperforms state-of-the-art frameworks with 2.35x speedup while accelerating search time by up to 1.58x.
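The search strategy the abstract describes can be sketched as a best-first search whose score combines a graph's measured cost with its high-level operator features. This is a minimal illustration under assumptions, not the paper's actual implementation: the function names (`features`, `heuristic`, `search`), the choice of four abstract operator types, and the scoring weight are all hypothetical.

```python
import heapq

# Illustrative sketch (hypothetical names; not HeuSO's actual code) of a
# heuristic best-first search over graph substitutions. A computational graph
# is abstracted as its runtime cost plus counts of four abstract operator
# classes, standing in for the paper's high-level features.

ABSTRACT_TYPES = ("conv", "matmul", "elementwise", "reshape")  # assumed classes

def features(graph):
    """High-level features: operator counts per abstract type."""
    return tuple(graph["ops"].get(t, 0) for t in ABSTRACT_TYPES)

def heuristic(graph, alpha=0.1):
    """Score a candidate: measured cost plus a small feature term, so the
    search can prefer smaller graphs when costs are close."""
    return graph["cost"] + alpha * sum(features(graph))

def search(initial, substitutions, budget=100):
    """Best-first search over substitution sequences; each substitution maps
    a graph to a rewritten graph, or to None if its pattern does not match."""
    best = initial
    frontier = [(heuristic(initial), 0, initial)]
    tiebreak = 1  # keeps heap comparisons away from the graph dicts
    seen = set()
    while frontier and budget > 0:
        _, _, graph = heapq.heappop(frontier)
        budget -= 1
        if graph["cost"] < best["cost"]:
            best = graph
        for sub in substitutions:
            candidate = sub(graph)
            if candidate is None:
                continue  # cheap pruning: rule pattern not applicable
            key = (candidate["cost"], features(candidate))
            if key in seen:
                continue  # an equivalent state was already queued
            seen.add(key)
            heapq.heappush(frontier, (heuristic(candidate), tiebreak, candidate))
            tiebreak += 1
    return best

# Toy substitution rule: fuse two elementwise operators into one, saving cost.
def fuse_elementwise(graph):
    if graph["ops"].get("elementwise", 0) >= 2:
        ops = dict(graph["ops"])
        ops["elementwise"] -= 1
        return {"cost": graph["cost"] - 1.0, "ops": ops}
    return None

g0 = {"cost": 10.0, "ops": {"elementwise": 4, "matmul": 2}}
optimized = search(g0, [fuse_elementwise])  # repeatedly fuses elementwise ops
```

The operator-count check inside `fuse_elementwise` mirrors the abstract's pruning idea: a rule is rejected from the high-level features alone, before any subgraph matching or cost measurement would be needed.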
Pages: 236-241
Page count: 6
Related Papers
16 items in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] Chen TQ, 2018, PROCEEDINGS OF THE 13TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P579
  • [3] Cormen T. H., 2022, Introduction to Algorithms
  • [4] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
  • [5] Dong Shi, 2021, A survey on deep learning and its applications, COMPUT SCI REV, V40, DOI 10.1016/j.cosrev.2021.100379
  • [6] He KM, Zhang XY, Ren SQ, Sun J, 2016, Deep Residual Learning for Image Recognition, 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), P770
  • [7] Jia ZH, Padon O, Thomas J, Warszawski T, Zaharia M, Aiken A, 2019, TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions, PROCEEDINGS OF THE TWENTY-SEVENTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP '19), P47
  • [8] Jia Zhihao, 2019, P MACH LEARN SYST 20
  • [9] Iandola FN, 2016, Arxiv, DOI arXiv:1602.07360
  • [10] Niu W, Guan JX, Wang YZ, Agrawal G, Ren B, 2021, DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion, PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21), P883