REx: Data-Free Residual Quantization Error Expansion

Cited by: 0
Authors
Yvinec, Edouard [1 ,2 ]
Dapogny, Arnaud [2 ]
Cord, Matthieu [1 ]
Bailly, Kevin [1 ,2 ]
Affiliations
[1] Sorbonne Univ, CNRS, ISIR, 4 Pl Jussieu, F-75005 Paris, France
[2] Datakalab, 114 Blvd Malesherbes, F-75017 Paris, France
Source
Advances in Neural Information Processing Systems 36 (NeurIPS 2023) | 2023
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) are ubiquitous in computer vision and natural language processing, but suffer from high inference cost. This problem can be addressed by quantization, which consists of converting floating-point operations to a lower bit-width format. With growing concerns over privacy rights, we focus our efforts on data-free methods. However, such techniques suffer from a lack of adaptability to the target device, as hardware typically supports only specific bit widths. Thus, to adapt to a variety of devices, a quantization method should be flexible enough to find good accuracy vs. speed trade-offs for every bit width and target device. To achieve this, we propose REx, a quantization method that leverages residual error expansion along with group sparsity. We show experimentally that REx enables better trade-offs (in terms of accuracy at any target bit width) on both convnets and transformers for computer vision, as well as on NLP models. In particular, when applied to large language models, we show that REx elegantly solves the outlier problem that hinders state-of-the-art quantization methods. In addition, REx is backed by strong theoretical guarantees on the preservation of the predictive function of the original model. Lastly, we show that REx is agnostic to the quantization operator and can be used in combination with previous quantization work.
Pages: 12
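
The core mechanism named in the abstract, residual quantization error expansion, can be illustrated with a short sketch: quantize the weights once, then quantize the error left over by that quantization, and approximate the original weights as the sum of the de-quantized terms. The snippet below is a minimal NumPy illustration under assumed choices (a symmetric uniform quantizer with a single per-tensor scale, and no group sparsity); it is not the authors' implementation, only the expansion idea.

```python
# Minimal sketch of residual quantization error expansion (assumptions:
# symmetric uniform quantizer, per-tensor scale, group sparsity omitted).
# Each expansion order quantizes the error left by the previous orders,
# so summing the de-quantized terms recovers the weights increasingly well.
import numpy as np

def quantize(w, bits):
    """Symmetric uniform quantization: returns (integer codes, scale)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float32) * scale

def residual_expansion(w, bits=4, orders=3):
    """Quantize w, then repeatedly quantize the remaining error."""
    terms, residual = [], w.copy()
    for _ in range(orders):
        codes, scale = quantize(residual, bits)
        terms.append((codes, scale))
        residual = residual - dequantize(codes, scale)  # error for the next order
    return terms

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)
    terms = residual_expansion(w, bits=4, orders=3)
    approx = np.zeros_like(w)
    for k, (codes, scale) in enumerate(terms, start=1):
        approx += dequantize(codes, scale)
        err = np.linalg.norm(w - approx) / np.linalg.norm(w)
        print(f"order {k}: relative reconstruction error = {err:.4f}")
```

Each additional expansion order reduces the reconstruction error at the cost of extra low-bit operations, which is one way to read the accuracy vs. speed trade-off the abstract refers to; how the expansion terms are sparsified and scheduled is the contribution of the paper itself, not shown here.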