TURBO: The Swiss Knife of Auto-Encoders

Times Cited: 3
Authors
Quetant, Guillaume [1]
Belousov, Yury [1]
Kinakh, Vitaliy [1]
Voloshynovskiy, Slava [1]
Affiliations
[1] Univ Geneva, Ctr Univ Informat, Route Drize 7, CH-1227 Carouge, Switzerland
Keywords
information bottleneck; TURBO; generalisation; auto-encoder; variational approximation; lower bound; mutual information; physical latent space; representations; Kullback-Leibler divergence
DOI
10.3390/e25101471
Chinese Library Classification
O4 [Physics]
Discipline Code
0702
Abstract
We present a novel information-theoretic framework, termed TURBO, designed to systematically analyse and generalise auto-encoding methods. We start by examining the principles of the information bottleneck and bottleneck-based networks in the auto-encoding setting, and by identifying their inherent limitations, which become more prominent for data with multiple relevant, physics-related representations. We then introduce the TURBO framework, providing a comprehensive derivation of its core concept: the maximisation of mutual information between various data representations, expressed in two directions reflecting the information flows. We show that numerous prevalent neural network models are encompassed within this framework. The paper underscores the insufficiency of the information bottleneck concept in explaining all such models, thereby establishing TURBO as a preferable theoretical reference. The introduction of TURBO contributes to a richer understanding of data representation and of the structure of neural network models, enabling more efficient and versatile applications.
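The abstract's core idea — auto-encoding viewed as maximising a variational lower bound on mutual information between representations — can be illustrated with a minimal sketch. This is not the paper's implementation: the linear model, toy data, and hyper-parameters below are hypothetical, and only one of the two information-flow directions the abstract describes is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2-D observations concentrated near a 1-D subspace.
z_true = rng.normal(size=(256, 1))
X = z_true @ np.array([[2.0, 1.0]]) + 0.05 * rng.normal(size=(256, 2))

# Minimal linear auto-encoder with a 1-D bottleneck.
W_enc = rng.normal(scale=0.1, size=(2, 1))  # encoder: x -> z
W_dec = rng.normal(scale=0.1, size=(1, 2))  # decoder: z -> x_hat

lr = 0.02
for _ in range(2000):
    Z = X @ W_enc
    X_hat = Z @ W_dec
    err = X_hat - X
    # Minimising the reconstruction MSE is equivalent (up to constants) to
    # maximising a Gaussian variational lower bound on I(X; Z).
    g_dec = Z.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
```

Replacing the linear maps with deep networks, and adding a symmetric term for the reverse direction, would move this sketch toward the two-directional objective the abstract describes.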
Pages: 29