LAGrad: Statically Optimized Differentiable Programming in MLIR

Cited by: 1
Authors
Peng, Mai Jacob [1]
Dubach, Christophe [1, 2]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
[2] Mila, Montreal, PQ, Canada
Source
PROCEEDINGS OF THE 32ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION, CC 2023 | 2023
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
automatic differentiation; MLIR; differentiable programming; static analysis; sparsity;
DOI
10.1145/3578360.3580259
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Automatic differentiation (AD) is a central algorithm in deep learning and the emerging field of differentiable programming. However, the performance of AD remains a significant bottleneck in these fields. Training large models requires evaluating gradients via AD repeatedly, potentially millions of times. Additionally, the most common form of AD incurs an asymptotically large memory cost relative to the original function being differentiated. This paper introduces LAGrad, a reverse-mode, source-to-source AD system that leverages high-level information in MLIR to produce efficient differentiated code. LAGrad employs a collection of novel static optimizations that benefit from the semantics of high-level MLIR dialects to exploit the sparsity and structured control flow of generated code. Using these, LAGrad achieves speedups of up to 2.8x and uses 35x less memory relative to state-of-the-art AD systems on real-world machine learning and computer vision benchmarks.
Pages: 228-238
Page count: 11
Related papers
25 in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] Anderson E., 1999, LAPACK USERS GUIDE
  • [3] Belbute-Peres FD, 2018, ADV NEUR IN, V31
  • [4] EFFICIENTLY COMPUTING STATIC SINGLE ASSIGNMENT FORM AND THE CONTROL DEPENDENCE GRAPH
    CYTRON, R
    FERRANTE, J
    ROSEN, BK
    WEGMAN, MN
    ZADECK, FK
    [J]. ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 1991, 13 (04): 451-490
  • [5] Frostig Roy, 2018, Systems for Machine Learning, V4, P9
  • [6] Griewank Andreas, 1989, Mathematical Programming: recent developments and applications, V6, P83
  • [7] To be recorded analysis in reverse-mode automatic differentiation
    Hascoët, L
    Naumann, U
    Pascual, V
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2005, 21 (08): 1401-1417
  • [8] The Tapenade Automatic Differentiation Tool: Principles, Model, and Specification
    Hascoet, Laurent
    Pascual, Valerie
    [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2013, 39 (03):
  • [9] Innes M, 2019, Arxiv, DOI arXiv:1810.07951
  • [10] Innes M, 2019, Arxiv, DOI [arXiv:1907.07587, DOI 10.48550/ARXIV.1907.07587]