Training Compact DNNs with ℓ1/2 Regularization

Cited by: 3
Authors
Tang, Anda [1 ]
Niu, Lingfeng [2 ,3 ]
Miao, Jianyu [4 ]
Zhang, Peng [5 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Res Ctr Fictitious Econ & Data Sci, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Econ & Management, Beijing 100190, Peoples R China
[4] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[5] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 511442, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep neural networks; Model compression; ℓ1/2 quasi-norm; Non-Lipschitz regularization; Sparse optimization; L-1/2 regularization; Variable selection; Neural networks; Representation; Minimization; Dropout; Model
DOI
10.1016/j.patcog.2022.109206
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in many fields. However, their large number of parameters imposes a heavy burden on storage and computation, which hinders the development and application of DNNs. It is therefore worthwhile to compress the model in order to reduce its complexity. Sparsity-inducing regularization is one of the most common tools for compression. In this paper, we propose using the ℓ1/2 quasi-norm to zero out weights of neural networks and to compress the networks automatically during the learning process. To our knowledge, this is the first work to apply a non-Lipschitz-continuous regularizer to the compression of DNNs. The resulting sparse optimization problem is solved by a stochastic proximal gradient algorithm. For further convenience of calculation, an approximation of the threshold-form solution to the proximal operator of ℓ1/2 is given as well. Extensive experiments on various datasets and against multiple baselines demonstrate the advantages of our new method. (c) 2022 Elsevier Ltd. All rights reserved.
Pages: 12
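As a rough sketch of the setting described in the abstract, the ℓ1/2-regularized training problem and the classical closed-form half-thresholding operator from the ℓ1/2 literature (Xu et al., 2012) can be written as below. The notation is illustrative and not taken from the paper; the paper's own approximation of the threshold-form solution is not reproduced here.

% A minimal sketch, assuming weights W with entries w_i, training loss L(W),
% and regularization parameter lambda > 0 (symbols are assumptions, not the
% paper's notation).
\begin{equation}
  \min_{W} \; \mathcal{L}(W) + \lambda \sum_{i} |w_i|^{1/2}
\end{equation}

% Classical half-thresholding operator for the scalar proximal subproblem
% \min_x \tfrac{1}{2}(x - y)^2 + \lambda |x|^{1/2} (Xu et al., 2012); the
% paper proposes an approximation of this threshold-form solution for use
% inside a stochastic proximal gradient method.
\begin{equation}
  h_{\lambda}(y) =
  \begin{cases}
    \dfrac{2y}{3}\left(1 + \cos\!\left(\dfrac{2\pi}{3}
      - \dfrac{2}{3}\arccos\!\left(\dfrac{\lambda}{8}
        \left(\dfrac{|y|}{3}\right)^{-3/2}\right)\right)\right),
      & |y| > \dfrac{\sqrt[3]{54}}{4}\,\lambda^{2/3}, \\[1ex]
    0, & \text{otherwise.}
  \end{cases}
\end{equation}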