Emergence of Invariance and Disentanglement in Deep Representations

Cited by: 0
Authors
Achille, Alessandro [1]
Soatto, Stefano [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Source
2018 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA) | 2018
Keywords
Deep learning; neural network; representation; flat minima; information bottleneck; overfitting; generalization; sufficiency; minimality; sensitivity; information complexity; stochastic gradient descent; regularization; total correlation; PAC-Bayes;
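DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract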
Using established principles from Information Theory and Statistics, we show that in a deep neural network invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that, in order to avoid memorization, we need to limit the quantity of information stored in the weights, which leads to a novel usage of the Information Bottleneck Lagrangian on the weights as a learning criterion. This also has an alternative interpretation as minimizing a PAC-Bayesian bound on the test error. Finally, we exploit a duality between weights and activations induced by the architecture, to show that the information in the weights bounds the minimality and Total Correlation of the layers, therefore showing that regularizing the weights explicitly or implicitly, using SGD, not only helps avoid overfitting, but also fosters invariance and disentangling of the learned representation. The theory also enables predicting sharp phase transitions between underfitting and overfitting random labels at precise information values, and sheds light on the relation between the geometry of the loss function, in particular so-called "flat minima," and generalization.
Pages: 29