Reusing Convolutional Neural Network Models through Modularization and Composition

Cited by: 4
Authors
Qi, Binhang [1 ]
Sun, Hailong [2 ]
Zhang, Hongyu [3 ]
Gao, Xiang [4 ]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, SKLSDE, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Software, SKLSDE, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
[3] Chongqing Univ, Sch Big Data Software Engn, 55 Univ Town South Rd, Chongqing 401331, Peoples R China
[4] Beihang Univ, Sch Software, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Model reuse; convolutional neural network; CNN modularization; module composition;
DOI
10.1145/3632744
Chinese Library Classification
TP31 [Computer Software];
Subject Classification Codes
081202; 0835;
Abstract
With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing public DNN models for new tasks often fails due to mismatched functionality or performance. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches named CNNSplitter and GradSplitter, which can decompose a trained convolutional neural network (CNN) model for N-class classification into N small reusable modules. Each module recognizes one of the N classes and contains a subset of the convolution kernels of the trained CNN model. The resulting modules can then be reused to patch existing CNN models or to build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore the search space, while the latter uses a gradient-based search method. Our experiments with three representative CNNs on three widely used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results when patching weak models. In particular, experiments on GradSplitter show that (1) by patching weak models, the average improvements in precision, recall, and F1-score are 17.13%, 4.95%, and 11.47%, respectively, and (2) for a new task, compared with models trained from scratch, reusing modules achieves similar accuracy (the average accuracy loss is only 2.46%) without a costly training process. Our approaches provide a viable solution for the rapid development and improvement of CNN models.
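The abstract's core idea can be sketched in code: decomposition keeps, for each class, only a subset of a trained model's convolution kernels, and composition turns N single-class modules back into an N-class classifier by taking the most confident module. The sketch below is a minimal toy illustration, not the paper's algorithm: the kernel subsets are fixed by hand (whereas CNNSplitter and GradSplitter search for them with a genetic algorithm and gradient-based search, respectively), and `module_score` is a hypothetical confidence measure invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "trained" conv layer: 8 kernels of shape 3x3 over 1 input channel.
conv_weights = rng.normal(size=(8, 1, 3, 3))

def extract_module(weights, kernel_indices):
    """Decomposition: a module keeps only a subset of the trained kernels.
    (In the paper, the search methods decide which kernels; here the
    subset is given by hand purely for illustration.)"""
    return weights[kernel_indices]

def module_score(module_weights, image):
    """Hypothetical per-class confidence: mean ReLU response of the
    module's kernels over the image (valid convolution, global average)."""
    h, w = image.shape
    responses = []
    for kernel in module_weights[:, 0]:  # each 3x3 kernel
        out = np.zeros((h - 2, w - 2))
        for i in range(h - 2):
            for j in range(w - 2):
                out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
        responses.append(np.maximum(out, 0).mean())
    return float(np.mean(responses))

def compose(modules, image):
    """Composition: each single-class module emits a confidence for its
    own class; the composed classifier predicts the highest-scoring one."""
    scores = [module_score(m, image) for m in modules]
    return int(np.argmax(scores))

# Three single-class modules, each reusing an (arbitrary) kernel subset.
modules = [extract_module(conv_weights, idx)
           for idx in ([0, 1, 2], [3, 4], [5, 6, 7])]
prediction = compose(modules, rng.normal(size=(8, 8)))
assert prediction in (0, 1, 2)
```

The same composition step also suggests how patching works: a weak model's module for one class can be swapped for a stronger module recognizing that class, leaving the other modules untouched.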
Pages: 39