In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

Cited by: 182
Authors
Bulo, Samuel Rota [1]
Porzi, Lorenzo [1]
Kontschieder, Peter [1]
Affiliations
[1] Mapillary Research, Graz, Austria
DOI
10.1109/CVPR.2018.00591
Chinese Library Classification
TP18 (Artificial Intelligence Theory)
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this work we present In-Place Activated Batch Normalization (INPLACE-ABN), a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering the required information during the backward pass through inversion of the stored forward results, with only a minor increase (0.8-2%) in computation time. We also demonstrate how frequently used checkpointing approaches can be made computationally as efficient as INPLACE-ABN. In our experiments on image classification, we demonstrate results on ImageNet-1k that are on par with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report competitive results for COCO-Stuff and set new state-of-the-art results for Cityscapes and Mapillary Vistas. Code can be found at https://github.com/mapillary/inplace_abn.
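To make the mechanism described in the abstract concrete, the sketch below illustrates the core idea in PyTorch: the forward pass keeps only the fused output z = LeakyReLU(BN(x)) plus the small per-channel statistics, and the backward pass recovers BN's output by inverting the Leaky ReLU rather than storing it. This is a minimal illustration under assumed names and an NCHW layout, not the authors' implementation; the actual, optimized layer lives in the linked repository.

```python
import torch

# Sketch of the In-Place ABN idea (illustrative names, not the authors' API):
# forward keeps only z = LeakyReLU(BN(x)) and the per-channel statistics;
# backward recovers y = BN(x) by inverting the Leaky ReLU, so the large
# intermediate tensors x and y need not be stored.

def fused_bn_act_forward(x, gamma, beta, eps=1e-5, slope=0.01):
    """BatchNorm over (N, H, W) per channel, followed by Leaky ReLU."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    y = gamma.view(1, -1, 1, 1) * (x - mean) / torch.sqrt(var + eps) \
        + beta.view(1, -1, 1, 1)
    z = torch.where(y > 0, y, slope * y)  # Leaky ReLU
    # Only z, mean and var would be kept for the backward pass.
    return z, mean, var

def recover_bn_output(z, slope=0.01):
    """Invert the Leaky ReLU to get y = BN(x) back from the stored z."""
    return torch.where(z > 0, z, z / slope)

# Quick check that the inversion recovers y up to numerical precision.
x = torch.randn(2, 3, 4, 4)
gamma, beta = torch.ones(3), torch.zeros(3)
z, mean, var = fused_bn_act_forward(x, gamma, beta)
y = gamma.view(1, -1, 1, 1) * (x - mean) / torch.sqrt(var + 1e-5) \
    + beta.view(1, -1, 1, 1)
assert torch.allclose(recover_bn_output(z), y, atol=1e-6)
```

The inversion is only possible because the activation is strictly monotonic, which is one reason the method is typically paired with Leaky ReLU (nonzero slope) rather than plain ReLU.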
Pages: 5639-5647
Number of pages: 9
Related papers (50 in total)
  • [21] Memory-optimized collision detection algorithm based on bounding-volume hierarchies
    Wang Meng
    Wang Xiaorong
    Jin Hanjun
    Advanced Computer Technology, New Education, Proceedings, 2007: 229-232
  • [22] Engineering In-place (Shared-memory) Sorting Algorithms
    Axtmann, Michael
    Witt, Sascha
    Ferizovic, Daniel
    Sanders, Peter
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2022, 9 (01)
  • [23] In-Place Zero-Space Memory Protection for CNN
    Guan, Hui
    Ning, Lin
    Lin, Zhen
    Shen, Xipeng
    Zhou, Huiyang
    Lim, Seung-Hwan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [24] A Memory-Optimized and Energy-Efficient CNN Acceleration Architecture Based on FPGA
    Chang, Xuepeng
    Pan, Huihui
    Zhang, Dun
    Sun, Qiming
    Lin, Weiyang
    2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019: 2137-2141
  • [25] HiEngine: How to Architect a Cloud-Native Memory-Optimized Database Engine
    Ma, Yunus
    Xie, Siphrey
    Zhong, Henry
    Lee, Leon
    Lv, King
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022: 2177-2190
  • [26] A Latency-Optimized Reconfigurable NoC for In-Memory Acceleration of DNNs
    Mandal, Sumit K.
    Krishnan, Gokul
    Chakrabarti, Chaitali
    Seo, Jae-Sun
    Cao, Yu
    Ogras, Umit Y.
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2020, 10(3): 362-375
  • [27] Accelerating Electromagnetic Field Simulations Based on Memory-Optimized CPML-FDTD with OpenACC
    Padilla-Perez, Diego
    Medina-Sanchez, Isaac
    Hernandez, Jorge
    Couder-Castaneda, Carlos
    APPLIED SCIENCES-BASEL, 2022, 12(22)
  • [28] Optimized Stencil Computation Using In-Place Calculation on Modern Multicore Systems
    Augustin, Werner
    Heuveline, Vincent
    Weiss, Jan-Philipp
    EURO-PAR 2009: PARALLEL PROCESSING, PROCEEDINGS, 2009, 5704: 772+
  • [29] Visualization of large medical data sets using memory-optimized CPU and GPU algorithms
    Kiefer, G
    Lehmann, H
    Weese, E
    MEDICAL IMAGING 2005: VISUALIZATION, IMAGE-GUIDED PROCEDURES, AND DISPLAY, PTS 1 AND 2, 2005, 5744: 677-687
  • [30] Memory-Optimized Re-Gridding Architecture for Non-Uniform Fast Fourier Transform
    Cheema, Umer I.
    Nash, Gregory
    Ansari, Rashid
    Khokhar, Ashfaq
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2017, 64(7): 1853-1864