Emergence of Invariance and Disentanglement in Deep Representations

Cited by: 0
Authors
Achille, Alessandro [1 ]
Soatto, Stefano [1 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Source
2018 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA) | 2018
Keywords
Deep learning; neural network; representation; flat minima; information bottleneck; overfitting; generalization; sufficiency; minimality; sensitivity; information complexity; stochastic gradient descent; regularization; total correlation; PAC-Bayes;
DOI
Not available
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Subject Classification Code
0812
Abstract
Using established principles from Information Theory and Statistics, we show that, in a deep neural network, invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that, in order to avoid memorization, we must limit the quantity of information stored in the weights, which leads to a novel use of the Information Bottleneck Lagrangian on the weights as a learning criterion. This criterion also has an alternative interpretation as minimizing a PAC-Bayesian bound on the test error. Finally, we exploit a duality between weights and activations induced by the architecture to show that the information in the weights bounds the minimality and Total Correlation of the layers, thereby showing that regularizing the weights, explicitly or implicitly via SGD, not only helps avoid overfitting but also fosters invariance and disentanglement of the learned representation. The theory also predicts sharp phase transitions between underfitting and overfitting random labels at precise information values, and sheds light on the relation between the geometry of the loss function, in particular so-called "flat minima," and generalization.
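For orientation, the learning criterion the abstract refers to can be written as an Information Bottleneck Lagrangian on the weights; the display below is a sketch in standard IB notation, with the symbol choices being ours rather than a quotation from the paper:

    \mathcal{L}(w) \;=\; H_{p,q}(y \mid x, w) \;+\; \beta \, I(w; \mathcal{D})

Here H_{p,q}(y | x, w) is the usual cross-entropy loss (sufficiency of the weights for the task), I(w; D) is the information the weights w store about the training set D (minimality, penalizing memorization), and the multiplier beta sets the trade-off between the two; the sharp transitions between underfitting and overfitting random labels mentioned in the abstract occur at critical values of this trade-off.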
Pages: 29