Emergence of Invariance and Disentanglement in Deep Representations

Cited by: 0

Authors:

Achille, Alessandro [1]

Soatto, Stefano [1]

Affiliations:

[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA

Source:

2018 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA) | 2018

Keywords:

Deep learning; neural network; representation; flat minima; information bottleneck; overfitting; generalization; sufficiency; minimality; sensitivity; information complexity; stochastic gradient descent; regularization; total correlation; PAC-Bayes;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Using established principles from Information Theory and Statistics, we show that in a deep neural network invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that, in order to avoid memorization, we need to limit the quantity of information stored in the weights, which leads to a novel usage of the Information Bottleneck Lagrangian on the weights as a learning criterion. This also has an alternative interpretation as minimizing a PAC-Bayesian bound on the test error. Finally, we exploit a duality between weights and activations induced by the architecture, to show that the information in the weights bounds the minimality and Total Correlation of the layers, therefore showing that regularizing the weights explicitly or implicitly, using SGD, not only helps avoid overfitting, but also fosters invariance and disentangling of the learned representation. The theory also enables predicting sharp phase transitions between underfitting and overfitting random labels at precise information values, and sheds light on the relation between the geometry of the loss function, in particular so-called "flat minima," and generalization.

Pages: 29

50 items in total

[41] Automatic Image Annotation using Deep Learning Representations
Murthy, Venkatesh N.
Maji, Subhransu
Manmatha, R.
ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2015, : 603 - 606
[42] Learning Deep Pyramid-based Representations for Pansharpening
Adeel, Hannan
Ali, Syed Sohaib
Riaz, Muhammad Mohsin
Kirmani, Syed Abdul Mannan
Qureshi, Muhammad Imran
Imtiaz, Junaid
ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2022, 47 (08) : 10655 - 10666
[43] Learning social representations with deep autoencoder for recommender system
Yiteng Pan
Fazhi He
Haiping Yu
World Wide Web, 2020, 23 : 2259 - 2279
[44] Hippocampal representations for deep learning on Alzheimer's disease
Sarasua, Ignacio
Polsterl, Sebastian
Wachinger, Christian
SCIENTIFIC REPORTS, 2022, 12 (01)
[45] Face clustering using a weighted combination of deep representations
Skiadopoulou, Dafni
Likas, Aristidis
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (02) : 995 - 1006
[46] Learning deep representations via extreme learning machines
Yu, Wenchao
Zhuang, Fuzhen
He, Qing
Shi, Zhongzhi
NEUROCOMPUTING, 2015, 149 : 308 - 315
[47] Learning Deep Pyramid-based Representations for Pansharpening
Hannan Adeel
Syed Sohaib Ali
Muhammad Mohsin Riaz
Syed Abdul Mannan Kirmani
Muhammad Imran Qureshi
Junaid Imtiaz
Arabian Journal for Science and Engineering, 2022, 47 : 10655 - 10666
[48] Audio representations for deep learning in sound synthesis: A review
Natsiou, Anastasia
O'Leary, Sean
2021 IEEE/ACS 18TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2021,
[49] Deep Set Conditioned Latent Representations for Action Recognition
Singh, Akash
de Schepper, Tom
Mets, Kevin
Hellinckx, Peter
Oramas, Jose
Latre, Steven
PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 456 - 466
[50] Towards Deep Anomaly Detection with Structured Knowledge Representations
Kirchheim, Konstantin
COMPUTER SAFETY, RELIABILITY, AND SECURITY, SAFECOMP 2023 WORKSHOPS, 2023, 14182 : 382 - 389
