Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks

Cited: 0
Authors
Deng, Zhiwei [1 ]
Russakovsky, Olga [1 ]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
Funding
U.S. National Science Foundation (NSF)
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We propose an algorithm that compresses the critical information of a large dataset into compact addressable memories. These memories can then be recalled to quickly re-train a neural network and recover its performance (instead of storing and re-training on the full original dataset). Building upon the dataset distillation framework, we make a key observation that a shared common representation allows for more efficient and effective distillation. Concretely, we learn a set of bases ("memories") that are shared across classes and combined through learned, flexible addressing functions to generate a diverse set of training examples. This leads to several benefits: 1) the size of the compressed data does not necessarily grow linearly with the number of classes; 2) an overall higher compression rate with more effective distillation is achieved; and 3) more generalized queries are allowed beyond recalling the original classes. We demonstrate state-of-the-art results on the dataset distillation task across six benchmarks, including improvements in retained accuracy of up to 16.5% on CIFAR10 and 9.7% on CIFAR100. We then leverage our framework to perform continual learning, achieving state-of-the-art results on four benchmarks, with a 23.2% accuracy improvement on MANY. The code is released on our project webpage.
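The abstract describes the mechanism only at a high level. Below is a minimal PyTorch sketch of the idea under stated assumptions: a simple linear addressing function that mixes a small set of shared bases into per-class synthetic training examples. Names such as `AddressableMemory`, `num_bases`, and `recalls_per_class` are illustrative and not taken from the authors' released code.

```python
# Minimal sketch of the "addressable memories" idea from the abstract.
# Assumption: a linear addressing function; shapes and names are illustrative.
import torch
import torch.nn as nn

class AddressableMemory(nn.Module):
    def __init__(self, num_bases=16, num_classes=10, recalls_per_class=5,
                 image_shape=(3, 32, 32)):
        super().__init__()
        # Shared bases ("memories"), reused by every class, so storage does not
        # need to grow linearly with the number of classes.
        self.bases = nn.Parameter(torch.randn(num_bases, *image_shape) * 0.1)
        # Learned addressing coefficients: each (class, recall) query mixes the bases.
        self.address = nn.Parameter(
            torch.randn(num_classes, recalls_per_class, num_bases) * 0.1)

    def forward(self):
        # Synthesize training examples as linear combinations of the shared bases.
        # images: (num_classes, recalls_per_class, C, H, W)
        images = torch.einsum('nrk,kchw->nrchw', self.address, self.bases)
        num_classes, recalls = self.address.shape[0], self.address.shape[1]
        labels = torch.arange(num_classes, device=images.device)
        labels = labels.repeat_interleave(recalls)
        # Flatten to a plain (num_classes * recalls_per_class) training set.
        return images.flatten(0, 1), labels

memory = AddressableMemory()
images, labels = memory()
# A small network can now be quickly re-trained on (images, labels)
# instead of the full original dataset.
```

In the actual dataset-distillation setting, the bases and addressing coefficients would themselves be optimized by backpropagating through the re-training of a student network on the recalled examples; that outer optimization loop is omitted in this sketch.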
Pages: 14
Related Papers
50 items in total
  • [2] Fukui, Shota; Yu, Jaehoon; Hashimoto, Masanori. Distilling Knowledge for Non-Neural Networks. 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019: 1411-1416
  • [3] Zhou, Sheng; Wang, Yucheng; Chen, Defang; Chen, Jiawei; Wang, Xin; Wang, Can; Bu, Jiajun. Distilling Holistic Knowledge with Graph Neural Networks. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 10367-10376
  • [4] Antezana Barrios, Lorena; Sanchez Sepulveda, Juan Pablo; Silva Moreno, Rocio. Pictures to remember. Generational memories about the recent past in Chile. IC-Revista Cientifica de Informacion y Comunicacion, 2020, (17): 247-271
  • [5] Dávila, JJ. Genetic evolution of neural networks that remember. Proceedings of the 2002 International Joint Conference on Neural Networks, Vols 1-3, 2002: 1148-1153
  • [7] Kushawaha, Ravi Kumar; Kumar, Saurabh; Banerjee, Biplab; Velmurugan, Rajbabu. Distilling Spikes: Knowledge Distillation in Spiking Neural Networks. 2020 25th International Conference on Pattern Recognition (ICPR), 2021: 4536-4543
  • [8] Anderson, Mark; Gomez-Rodriguez, Carlos. Distilling Neural Networks for Greener and Faster Dependency Parsing. 16th International Conference on Parsing Technologies and IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies, 2020: 2-13
  • [9] Chan, HY; Zak, SH. On neural networks that design neural associative memories. IEEE Transactions on Neural Networks, 1997, 8(2): 360-372
  • [10] Cetinic, Eva; Lipic, Tomislav; Grgic, Sonja. How Convolutional Neural Networks Remember Art. 2018 25th International Conference on Systems, Signals and Image Processing (IWSSIP), 2018