In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

Cited by: 182
Authors
Bulo, Samuel Rota [1]
Porzi, Lorenzo [1]
Kontschieder, Peter [1]
Affiliations
[1] Mapillary Research, Graz, Austria
DOI
10.1109/CVPR.2018.00591
Chinese Library Classification
TP18 (Artificial Intelligence Theory)
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this work we present In-Place Activated Batch Normalization (INPLACE-ABN), a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering the required information during the backward pass through inversion of the stored forward results, with only a minor increase (0.8-2%) in computation time. We also demonstrate how frequently used checkpointing approaches can be made computationally as efficient as INPLACE-ABN. In our experiments on image classification, we demonstrate results on ImageNet-1k that are on par with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report competitive results for COCO-Stuff and set new state-of-the-art results for Cityscapes and Mapillary Vistas. Code can be found at https://github.com/mapillary/inplace_abn.
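To make the mechanism described in the abstract concrete, the sketch below illustrates the core idea in PyTorch: the forward pass keeps only the fused output z = LeakyReLU(BN(x)) plus the small per-channel statistics, and the backward pass recovers BN's output by inverting the Leaky ReLU rather than storing it. This is a minimal illustration under assumed names and an NCHW layout, not the authors' implementation; the actual, optimized layer lives in the linked repository.

```python
import torch

# Sketch of the In-Place ABN idea (illustrative names, not the authors' API):
# forward keeps only z = LeakyReLU(BN(x)) and the per-channel statistics;
# backward recovers y = BN(x) by inverting the Leaky ReLU, so the large
# intermediate tensors x and y need not be stored.

def fused_bn_act_forward(x, gamma, beta, eps=1e-5, slope=0.01):
    """BatchNorm over (N, H, W) per channel, followed by Leaky ReLU."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    y = gamma.view(1, -1, 1, 1) * (x - mean) / torch.sqrt(var + eps) \
        + beta.view(1, -1, 1, 1)
    z = torch.where(y > 0, y, slope * y)  # Leaky ReLU
    # Only z, mean and var would be kept for the backward pass.
    return z, mean, var

def recover_bn_output(z, slope=0.01):
    """Invert the Leaky ReLU to get y = BN(x) back from the stored z."""
    return torch.where(z > 0, z, z / slope)

# Quick check that the inversion recovers y up to numerical precision.
x = torch.randn(2, 3, 4, 4)
gamma, beta = torch.ones(3), torch.zeros(3)
z, mean, var = fused_bn_act_forward(x, gamma, beta)
y = gamma.view(1, -1, 1, 1) * (x - mean) / torch.sqrt(var + 1e-5) \
    + beta.view(1, -1, 1, 1)
assert torch.allclose(recover_bn_output(z), y, atol=1e-6)
```

The inversion is only possible because the activation is strictly monotonic, which is one reason the method is typically paired with Leaky ReLU (nonzero slope) rather than plain ReLU.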
Pages: 5639-5647
Number of pages: 9
Related papers (50 in total)
  • [21] Memory-optimized collision detection algorithm based on bounding-volume hierarchies
    Wang Meng
    Wang Xiaorong
    Jin Hanjun
    Advanced Computer Technology, New Education, Proceedings, 2007: 229-232
  • [22] Engineering In-place (Shared-memory) Sorting Algorithms
    Axtmann, Michael
    Witt, Sascha
    Ferizovic, Daniel
    Sanders, Peter
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2022, 9 (01)
  • [23] In-Place Zero-Space Memory Protection for CNN
    Guan, Hui
    Ning, Lin
    Lin, Zhen
    Shen, Xipeng
    Zhou, Huiyang
    Lim, Seung-Hwan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [24] A Memory-Optimized and Energy-Efficient CNN Acceleration Architecture Based on FPGA
    Chang, Xuepeng
    Pan, Huihui
    Zhang, Dun
    Sun, Qiming
    Lin, Weiyang
    2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019: 2137-2141
  • [25] HiEngine: How to Architect a Cloud-Native Memory-Optimized Database Engine
    Ma, Yunus
    Xie, Siphrey
    Zhong, Henry
    Lee, Leon
    Lv, King
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022: 2177-2190
  • [26] A Latency-Optimized Reconfigurable NoC for In-Memory Acceleration of DNNs
    Mandal, Sumit K.
    Krishnan, Gokul
    Chakrabarti, Chaitali
    Seo, Jae-Sun
    Cao, Yu
    Ogras, Umit Y.
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2020, 10(3): 362-375
  • [27] Accelerating Electromagnetic Field Simulations Based on Memory-Optimized CPML-FDTD with OpenACC
    Padilla-Perez, Diego
    Medina-Sanchez, Isaac
    Hernandez, Jorge
    Couder-Castaneda, Carlos
    APPLIED SCIENCES-BASEL, 2022, 12(22)
  • [28] Optimized Stencil Computation Using In-Place Calculation on Modern Multicore Systems
    Augustin, Werner
    Heuveline, Vincent
    Weiss, Jan-Philipp
    EURO-PAR 2009: PARALLEL PROCESSING, PROCEEDINGS, 2009, 5704: 772+
  • [29] Visualization of large medical data sets using memory-optimized CPU and GPU algorithms
    Kiefer, G
    Lehmann, H
    Weese, E
    MEDICAL IMAGING 2005: VISUALIZATION, IMAGE-GUIDED PROCEDURES, AND DISPLAY, PTS 1 AND 2, 2005, 5744: 677-687
  • [30] Memory-Optimized Re-Gridding Architecture for Non-Uniform Fast Fourier Transform
    Cheema, Umer I.
    Nash, Gregory
    Ansari, Rashid
    Khokhar, Ashfaq
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2017, 64(7): 1853-1864