Normalization Techniques in Training DNNs: Methodology, Analysis and Application

Cited by: 132
Authors
Huang, Lei [1]
Qin, Jie [2]
Zhou, Yi [3]
Zhu, Fan [4]
Liu, Li [4]
Shao, Ling [5]
Affiliations
[1] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Nanjing 210016, Peoples R China
[3] Southeast Univ, Nanjing 210096, Jiangsu, Peoples R China
[4] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[5] Univ Chinese Acad Sci, UCAS Terminus AI Lab, Beijing 101408, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Batch normalization; deep neural networks; image classification; survey; weight normalization; optimization
DOI
10.1109/TPAMI.2023.3250241
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs), and have been used successfully in various applications. This paper reviews and comments on the past, present and future of normalization methods in the context of DNN training. We provide a unified picture of the main motivation behind different approaches from the perspective of optimization, and present a taxonomy for understanding the similarities and differences between them. Specifically, we decompose the pipeline of the most representative methods for normalizing activations into three components: normalization area partitioning, the normalization operation, and normalization representation recovery. In doing so, we provide insight for designing new normalization techniques. Finally, we discuss the current progress in understanding normalization methods, and provide a comprehensive review of the applications of normalization to particular tasks, in which it can effectively solve the key issues.
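The three-component decomposition named in the abstract can be illustrated with a minimal sketch. This is not code from the paper: it assumes batch normalization on a small 2-D mini-batch (rows are samples, columns are features) as the example method, and the function names `partition_by_feature`, `standardize`, `recover` and `batch_norm` are hypothetical labels chosen to mirror the three components. It shows training-mode statistics only, with no running averages.

```python
import math


def partition_by_feature(batch):
    """Area partitioning: decide which activations share statistics.

    Assumption: batch-norm style grouping, so each feature column
    across the mini-batch forms one group.
    """
    return [list(column) for column in zip(*batch)]


def standardize(group, eps=1e-5):
    """Normalization operation: center and scale one group
    to roughly zero mean and unit variance."""
    mean = sum(group) / len(group)
    var = sum((x - mean) ** 2 for x in group) / len(group)
    return [(x - mean) / math.sqrt(var + eps) for x in group]


def recover(group, gamma, beta):
    """Representation recovery: a per-group affine transform
    (gamma and beta would be learned during training)."""
    return [gamma * x + beta for x in group]


def batch_norm(batch, gammas, betas, eps=1e-5):
    """Chain the three components and return data in the input layout."""
    groups = partition_by_feature(batch)
    out_columns = [
        recover(standardize(group, eps), gamma, beta)
        for group, gamma, beta in zip(groups, gammas, betas)
    ]
    # Transpose back from per-feature columns to per-sample rows.
    return [list(row) for row in zip(*out_columns)]


if __name__ == "__main__":
    batch = [[1.0, 10.0], [3.0, 30.0]]
    print(batch_norm(batch, gammas=[1.0, 1.0], betas=[0.0, 0.0]))
```

Under this reading, swapping `partition_by_feature` for a function that groups each sample's own features would give a layer-norm style variant while leaving the other two steps unchanged.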
Pages: 10173-10196
Page count: 24