Normalization Techniques in Training DNNs: Methodology, Analysis and Application

Cited by: 189

Authors:

Huang, Lei [1]

Qin, Jie [2]

Zhou, Yi [3]

Zhu, Fan [4]

Liu, Li [4]

Shao, Ling [5]

Affiliations:

[1] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China

[2] Nanjing Univ Aeronaut & Astronaut, Nanjing 210016, Peoples R China

[3] Southeast Univ, Nanjing 210096, Jiangsu, Peoples R China

[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates

[5] Univ Chinese Acad Sci, UCAS Terminus AI Lab, Beijing 101408, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023 / Vol. 45 / No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

Batch normalization; deep neural networks; image classification; survey; weight normalization; OPTIMIZATION;

DOI:

10.1109/TPAMI.2023.3250241

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs), and have been used successfully in various applications. This paper reviews and comments on the past, present and future of normalization methods in the context of DNN training. We provide a unified picture of the main motivation behind different approaches from the perspective of optimization, and present a taxonomy for understanding the similarities and differences between them. Specifically, we decompose the pipeline of the most representative normalizing activation methods into three components: the normalization area partitioning, normalization operation and normalization representation recovery. In doing so, we provide insight for designing new normalization techniques. Finally, we discuss the current progress in understanding normalization methods, and provide a comprehensive review of the applications of normalization for particular tasks, in which it can effectively solve the key issues.
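The three-component decomposition named in the abstract can be illustrated with a minimal NumPy sketch of standard batch normalization. The mapping of each code step to a component is an assumption made here for illustration, not the paper's own formulation; the function name `batch_norm_sketch`, the parameter names `gamma`, `beta` and `eps`, and the (N, C, H, W) input layout are likewise hypothetical choices.

```python
import numpy as np

def batch_norm_sketch(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization on an (N, C, H, W) array.

    The three commented steps are one possible reading of the abstract's
    three components; the paper's precise definitions may differ.
    """
    n, c, h, w = x.shape
    # 1) Area partitioning (assumed reading): group all values that share
    #    a channel into one row, giving shape (C, N*H*W).
    areas = x.transpose(1, 0, 2, 3).reshape(c, -1)
    # 2) Normalization operation: standardize each row to zero mean
    #    and (approximately) unit variance.
    mean = areas.mean(axis=1, keepdims=True)
    var = areas.var(axis=1, keepdims=True)
    normed = (areas - mean) / np.sqrt(var + eps)
    # 3) Representation recovery: per-channel scale and shift
    #    (learnable in a real network, fixed arrays here).
    recovered = gamma[:, None] * normed + beta[:, None]
    # Undo the partitioning to restore the (N, C, H, W) layout.
    return recovered.reshape(c, n, h, w).transpose(1, 0, 2, 3)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))
y = batch_norm_sketch(x, gamma=np.ones(4), beta=np.zeros(4))
```

With `gamma` set to ones and `beta` to zeros, each channel of `y` has mean close to 0 and standard deviation close to 1. Under the same reading, other methods would mainly change step 1, for example by grouping values per sample instead of per channel.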

Citation

Pages: 10173-10196

Page count: 24

301 references in total
