Revisiting Orthogonality Regularization: A Study for Convolutional Neural Networks in Image Classification

Cited by: 6

Authors:

Kim, Taehyeon [1]

Yun, Se-Young [1]

Affiliations:

[1] Korea Advanced Institute of Science and Technology (KAIST), Kim Jaechul Graduate School of Artificial Intelligence, Seoul 02455, South Korea

Source:

IEEE Access | 2022 / Vol. 10

Keywords:

Training; Kernel; Convolutional neural networks; Redundancy; Artificial intelligence; Licenses; Signal to noise ratio; Deep neural network (DNN); kernel; orthogonality regularization; convolutional neural network (CNN); regularization; image classification;

DOI:

10.1109/ACCESS.2022.3185621

Chinese Library Classification (CLC) code:

TP [Automation and computer technology];

Discipline classification code:

0812;

Abstract:

Recent research on deep Convolutional Neural Networks (CNNs) faces the challenges of vanishing/exploding gradients, training instability, and feature redundancy. Orthogonality Regularization (OR), which introduces a penalty function that accounts for the orthogonality of neural networks, could remedy these challenges but is surprisingly uncommon in the literature. This work revisits OR approaches and empirically answers the question: when compared against other regularizations such as weight decay and spectral norm regularization, which OR technique is the most powerful? We begin by presenting the improvements obtained from various regularization techniques, focusing specifically on OR approaches across a variety of architectures. We then disentangle the benefits of OR relative to other regularization approaches, connecting them to how they affect norm preservation and feature redundancy in forward and backward propagation. Our investigations show that Kernel Orthogonality Regularization (KOR) approaches, which directly penalize the deviation of convolutional kernel matrices from orthogonality, consistently outperform other techniques. We propose a simple KOR method that considers both row and column orthogonality, which is empirically the most effective in mitigating the aforementioned challenges. We further discuss several settings in recent CNN models on various benchmark datasets in which KOR is more effective.
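The abstract does not spell out the penalty itself. As a rough illustration only, the sketch below assumes the widely used "soft" orthogonality form, in which a convolutional kernel is flattened into a matrix W and the penalty is ||W Wᵀ − I||²_F (row term) plus ||Wᵀ W − I||²_F (column term). The function names and the [out][in][kh][kw] nested-list layout are hypothetical choices for this example, not taken from the paper.

```python
# Sketch only: an assumed soft kernel-orthogonality penalty covering both
# rows and columns. The paper's exact formulation is not given in this record.

def reshape_kernel(kernel):
    """Flatten a kernel laid out as [out][in][kh][kw] into an out x (in*kh*kw) matrix."""
    return [[v for channel in filt for row in channel for v in row]
            for filt in kernel]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def gram(rows):
    """Return M @ M^T for a matrix M given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def sq_dist_to_identity(g):
    """Squared Frobenius norm of (g - I) for a square matrix g."""
    n = len(g)
    return sum((g[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(n) for j in range(n))

def kernel_orthogonality_penalty(kernel):
    """Row term ||W W^T - I||_F^2 plus column term ||W^T W - I||_F^2."""
    w = reshape_kernel(kernel)
    row_term = sq_dist_to_identity(gram(w))
    col_term = sq_dist_to_identity(gram(transpose(w)))
    return row_term + col_term

# A 2-filter, 2-channel, 1x1 kernel whose flattened matrix is the identity.
identity_kernel = [[[[1.0]], [[0.0]]],
                   [[[0.0]], [[1.0]]]]
penalty = kernel_orthogonality_penalty(identity_kernel)
```

In training, a term like this would typically be scaled by a coefficient and added to the task loss. When the flattened matrix is not square, the row and column terms cannot both reach zero, so their sum has a nonzero minimum.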

Pages: 69741-69749

Page count: 9
