Training Compact DNNs with ℓ1/2 Regularization

Cited by: 3
Authors
Tang, Anda [1 ]
Niu, Lingfeng [2 ,3 ]
Miao, Jianyu [4 ]
Zhang, Peng [5 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Res Ctr Fictitious Econ & Data Sci, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Econ & Management, Beijing 100190, Peoples R China
[4] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[5] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 511442, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep neural networks; Model compression; ℓ1/2 quasi-norm; Non-Lipschitz regularization; Sparse optimization; L-1/2 regularization; Variable selection; Neural networks; Representation; Minimization; Dropout; Model
DOI
10.1016/j.patcog.2022.109206
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in many fields. However, their large number of parameters imposes a heavy burden on storage and computation, which hinders the development and application of DNNs. It is therefore worthwhile to compress the model in order to reduce its complexity. Sparsity-inducing regularization is one of the most common tools for compression. In this paper, we propose using the ℓ1/2 quasi-norm to zero out weights of neural networks and to compress the networks automatically during the learning process. To our knowledge, this is the first work to apply a non-Lipschitz-continuous regularizer to the compression of DNNs. The resulting sparse optimization problem is solved by a stochastic proximal gradient algorithm. For further convenience of calculation, an approximation of the threshold-form solution to the proximal operator of ℓ1/2 is given as well. Extensive experiments on various datasets and against multiple baselines demonstrate the advantages of our new method. (c) 2022 Elsevier Ltd. All rights reserved.
Pages: 12
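As a rough sketch of the setting described in the abstract, the ℓ1/2-regularized training problem and the classical closed-form half-thresholding operator from the ℓ1/2 literature (Xu et al., 2012) can be written as below. The notation is illustrative and not taken from the paper; the paper's own approximation of the threshold-form solution is not reproduced here.

% A minimal sketch, assuming weights W with entries w_i, training loss L(W),
% and regularization parameter lambda > 0 (symbols are assumptions, not the
% paper's notation).
\begin{equation}
  \min_{W} \; \mathcal{L}(W) + \lambda \sum_{i} |w_i|^{1/2}
\end{equation}

% Classical half-thresholding operator for the scalar proximal subproblem
% \min_x \tfrac{1}{2}(x - y)^2 + \lambda |x|^{1/2} (Xu et al., 2012); the
% paper proposes an approximation of this threshold-form solution for use
% inside a stochastic proximal gradient method.
\begin{equation}
  h_{\lambda}(y) =
  \begin{cases}
    \dfrac{2y}{3}\left(1 + \cos\!\left(\dfrac{2\pi}{3}
      - \dfrac{2}{3}\arccos\!\left(\dfrac{\lambda}{8}
        \left(\dfrac{|y|}{3}\right)^{-3/2}\right)\right)\right),
      & |y| > \dfrac{\sqrt[3]{54}}{4}\,\lambda^{2/3}, \\[1ex]
    0, & \text{otherwise.}
  \end{cases}
\end{equation}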