Neural Network Compression for Noisy Storage Devices

Cited by: 3
Authors
Isik, Berivan [1 ]
Choi, Kristy [2 ]
Zheng, Xin [3 ]
Weissman, Tsachy [1 ]
Ermon, Stefano [2 ]
Wong, H. -S. Philip [3 ]
Alaghi, Armin [4 ]
Affiliations
[1] Stanford University, 350 Jane Stanford Way, Stanford, CA 94305 USA
[2] Stanford University, 353 Jane Stanford Way, Stanford, CA 94305 USA
[3] Stanford University, 330 Jane Stanford Way, Stanford, CA USA
[4] Meta Reality Labs, 9845 Willows Rd NE, Redmond, WA 98052 USA
Keywords
Neural networks; robustness; compression; analog storage; PCM; model compression; acceleration; algorithm
DOI
10.1145/3588436
CLC Number
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Compression and efficient storage of neural network (NN) parameters are critical for applications that run on resource-constrained devices. Despite significant progress in NN model compression, there has been considerably less investigation into the actual physical storage of NN parameters. Conventionally, model compression and physical storage are decoupled, as digital storage media with error-correcting codes (ECCs) provide robust, error-free storage. However, this decoupled approach is inefficient: it ignores the overparameterization present in most NNs and forces the memory device to allocate the same amount of resources to every bit of information regardless of its importance. In this work, we investigate analog memory devices as an alternative to digital media. Unlike their digital counterparts, analog devices naturally provide a way to add more protection for significant bits, but they are noisy and may compromise the stored model's performance if used naively. We develop a variety of robust coding strategies for NN weight storage on analog devices and propose an approach to jointly optimize model compression and memory resource allocation. We then demonstrate the efficacy of our approach on models trained on the MNIST, CIFAR-10, and ImageNet datasets with existing compression techniques. Compared to conventional error-free digital storage, our method reduces the memory footprint by up to one order of magnitude without significantly compromising the stored model's accuracy.
Pages: 29
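
To make the idea in the abstract concrete, here is a minimal toy sketch, not the authors' actual method: quantized weights are written bit-by-bit to analog cells modeled as an additive Gaussian channel, with more significant bits given larger write amplitudes so they survive read noise better than less significant ones. The function name, bit width, amplitude allocation, and noise level are all illustrative assumptions.

import numpy as np

# Toy illustration (not the paper's method): write each bit of a quantized
# weight to an analog cell as a signed level, giving more significant bits
# larger amplitudes so they better survive additive Gaussian read noise.

rng = np.random.default_rng(0)

def store_and_read(weights, n_bits=8, noise_std=0.1):
    # Uniformly quantize weights to n_bits integer codes.
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (2 ** n_bits - 1)
    q = np.round((weights - w_min) / scale).astype(np.int64)

    # Significance-weighted amplitudes (hypothetical allocation): the MSB
    # gets the largest amplitude; average cell power is normalized to 1.
    amps = 2.0 ** np.arange(n_bits)
    amps = amps / np.sqrt(np.mean(amps ** 2))

    bits = (q[..., None] >> np.arange(n_bits)) & 1   # LSB-first bit planes
    analog = (2 * bits - 1) * amps                   # signed analog levels
    noisy = analog + rng.normal(0.0, noise_std, analog.shape)

    bits_hat = (noisy > 0).astype(np.int64)          # threshold readout
    q_hat = (bits_hat << np.arange(n_bits)).sum(axis=-1)
    return q_hat * scale + w_min

w = rng.normal(0.0, 0.05, size=1000)   # stand-in for a weight tensor
w_hat = store_and_read(w)
print("reconstruction MSE:", np.mean((w - w_hat) ** 2))

In this toy model the low-order bits are frequently flipped at this noise level while the high-order bits are read back reliably, so the reconstruction error stays on the order of the quantization step rather than the full weight range. This is the intuition behind unequal protection of significant bits on analog media that the abstract describes.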