QLP: Deep Q-Learning for Pruning Deep Neural Networks

Cited by: 12
Authors
Camci, Efe [1 ]
Gupta, Manas [1 ]
Wu, Min [1 ]
Lin, Jie [1 ]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore 138632, Singapore
Keywords
Training; Neural networks; Indexes; Computer architecture; Deep learning; Biological neural networks; Task analysis; Deep neural network compression; pruning; deep reinforcement learning; MODEL COMPRESSION; SPARSITY;
DOI
10.1109/TCSVT.2022.3167951
CLC Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
We present QLP, a novel deep Q-learning based method for pruning deep neural networks (DNNs). Given a DNN, our method intelligently determines favorable layer-wise sparsity ratios, which are then implemented via unstructured, magnitude-based weight pruning. In contrast to previous reinforcement learning (RL) based pruning methods, our method is not forced to prune a DNN within a single, sequential pass from the first layer to the last. It visits each layer multiple times and prunes it a little at each visit, achieving finer-grained pruning. Moreover, our method is not restricted to a subset of actions within the feasible action space: it has the flexibility to apply the full range of sparsity ratios (0% - 100%) to each layer, which enables aggressive pruning without compromising accuracy. Furthermore, our method does not require a complex state definition; it uses a simple, generic state composed of only the index and the density of each layer, which reduces the computational cost of observing the state at each interaction. Lastly, our method uses a carefully designed curriculum that learns targeted policies for each sparsity regime, which helps deliver better accuracy, especially at high sparsity levels. We conduct batched performance tests at compelling sparsity levels (up to 98%), present extensive ablation studies to justify our RL-related design choices, and compare our method with the state of the art, including RL-based and other pruning methods. Our method sets new state-of-the-art results in most of the experiments with ResNet-32 and ResNet-56 on the CIFAR-10 dataset, as well as ResNet-50 and MobileNet-v1 on the ILSVRC2012 (ImageNet) dataset.
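The pruning step and the lightweight state described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the helper names (magnitude_prune, layer_state), their signatures, and the toy layer are assumptions made for illustration only, and the Q-learning agent that chooses each layer's sparsity ratio is omitted.

import torch
import torch.nn as nn


def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    # Return a binary mask that zeroes the smallest-magnitude weights so that
    # roughly `sparsity` (0.0 - 1.0) of the entries in this layer are removed.
    if sparsity <= 0.0:
        return torch.ones_like(weight)
    if sparsity >= 1.0:
        return torch.zeros_like(weight)
    threshold = torch.quantile(weight.abs().flatten(), sparsity)
    return (weight.abs() > threshold).float()


def layer_state(layer_index: int, num_layers: int, mask: torch.Tensor) -> torch.Tensor:
    # Simple, generic state as described in the abstract: the normalized layer
    # index and the layer's current density (fraction of surviving weights).
    return torch.tensor([layer_index / num_layers, mask.mean().item()])


if __name__ == "__main__":
    layer = nn.Linear(128, 64)                               # toy stand-in for one DNN layer
    mask = magnitude_prune(layer.weight.data, sparsity=0.9)  # ratio an agent might pick
    layer.weight.data.mul_(mask)                             # apply the pruning mask in place
    print(layer_state(0, 10, mask))                          # roughly tensor([0.0000, 0.1000])

In the full method, the agent would revisit each layer several times and raise its sparsity gradually according to the learned policy and the training curriculum, rather than pruning each layer once in a single pass.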
Pages: 6488-6501
Page count: 14