Emergence of Invariance and Disentanglement in Deep Representations

Cited by: 0
Authors
Achille, Alessandro [1]
Soatto, Stefano [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Source
2018 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA) | 2018
Keywords
Deep learning; neural network; representation; flat minima; information bottleneck; overfitting; generalization; sufficiency; minimality; sensitivity; information complexity; stochastic gradient descent; regularization; total correlation; PAC-Bayes;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Using established principles from Information Theory and Statistics, we show that in a deep neural network invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that, in order to avoid memorization, we need to limit the quantity of information stored in the weights, which leads to a novel usage of the Information Bottleneck Lagrangian on the weights as a learning criterion. This also has an alternative interpretation as minimizing a PAC-Bayesian bound on the test error. Finally, we exploit a duality between weights and activations induced by the architecture, to show that the information in the weights bounds the minimality and Total Correlation of the layers, therefore showing that regularizing the weights explicitly or implicitly, using SGD, not only helps avoid overfitting, but also fosters invariance and disentangling of the learned representation. The theory also enables predicting sharp phase transitions between underfitting and overfitting random labels at precise information values, and sheds light on the relation between the geometry of the loss function, in particular so-called "flat minima," and generalization.
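For orientation, the quantities named in the abstract can be written out as follows. This is a sketch in commonly used notation, which is an assumption here since the record itself gives no formulas: x is the input, y the label, z a representation (layer activations), w the weights, D the training set, β a trade-off multiplier, and H_{p,q} a cross-entropy.

```latex
% Information Bottleneck Lagrangian on the activations (standard form):
% keep z informative about y while limiting what it retains about x.
\mathcal{L}(z) = H_{p,q}(y \mid z) + \beta \, I(z; x)

% The same Lagrangian applied to the weights, as the abstract describes:
% fit the training set while limiting the information the weights
% store about it.
\mathcal{L}\big(q(w \mid \mathcal{D})\big) = H_{p,q}(\mathcal{D} \mid w) + \beta \, I(w; \mathcal{D})

% Total Correlation of a representation z = (z_1, \dots, z_n),
% the usual measure of (lack of) disentanglement; it is zero exactly
% when the components are independent.
\mathrm{TC}(z) = \mathrm{KL}\Big( q(z) \,\Big\|\, \prod_{i=1}^{n} q_i(z_i) \Big)
```

Read this way, the abstract's claim is that bounding I(w; D) also bounds both I(z; x) (minimality) and TC(z) (disentanglement) of the layers.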
Pages: 29
Related papers
50 in total
  • [1] Emergence of Invariance and Disentanglement in Deep Representations
    Achille, Alessandro
    Soatto, Stefano
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 19
  • [2] Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations
    Bietti, Alberto
    Mairal, Julien
    JOURNAL OF MACHINE LEARNING RESEARCH, 2019, 20
  • [3] THE EMERGENCE OF REPRESENTATIONS IN A TREATMENT
    BAUDUIN, A
    REVUE FRANCAISE DE PSYCHANALYSE, 1992, 56 (01): : 175 - 191
  • [4] Why Deep Learning Works: A Manifold Disentanglement Perspective
    Brahma, Pratik Prabhanjan
    Wu, Dapeng
    She, Yiyuan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (10) : 1997 - 2008
  • [5] Deep Clustering With Sample-Assignment Invariance Prior
    Peng, Xi
    Zhu, Hongyuan
    Feng, Jiashi
    Shen, Chunhua
    Zhang, Haixian
    Zhou, Joey Tianyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4857 - 4868
  • [6] The Emergence of Autonomous Representations in Artificial Agents
    Arnellos, Argyris
    Vosinakis, Spyros
    Spyrou, Thomas
    Darzentas, John
    JOURNAL OF COMPUTERS, 2006, 1 (06) : 29 - 36
  • [7] Statistical Characteristics of Deep Representations: An Empirical Investigation
    Choi, Daeyoung
    Lee, Kyungeun
    Hwang, Duhun
    Rhee, Wonjong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 43 - 55
  • [8] Flow time history deep learning for feature decomposition and disentanglement
    Zhan, Qingliang
    Liu, Xin
    Bai, Chunjin
    Chao, Yang
    Bao, Dongming
    Wang, Zhiyong
    Sun, Xiannian
    PHYSICA D-NONLINEAR PHENOMENA, 2025, 472
  • [9] DEEP FEATURE DISENTANGLEMENT LEARNING FOR BONE SUPPRESSION IN CHEST RADIOGRAPHS
    Lin, Chunze
    Tang, Ruixiang
    Lin, Darryl D.
    Liu, Langechuan
    Lu, Jiwen
    Chen, Yunqiang
    Gao, Dashan
    Zhou, Jie
    2020 IEEE 17TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2020), 2020, : 795 - 798
  • [10] A DEEP REPRESENTATION FOR INVARIANCE AND MUSIC CLASSIFICATION
    Zhang, Chiyuan
    Evangelopoulos, Georgios
    Voinea, Stephen
    Rosasco, Lorenzo
    Poggio, Tomaso
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,