Reusing Convolutional Neural Network Models through Modularization and Composition

Cited by: 2
Authors
Qi, Binhang [1]
Sun, Hailong [2]
Zhang, Hongyu [3]
Gao, Xiang [4]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, SKLSDE, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Software, SKLSDE, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
[3] Chongqing Univ, Sch Big Data Software Engn, 55 Univ Town South Rd, Chongqing 401331, Peoples R China
[4] Beihang Univ, Sch Software, Xueyuan Rd,Haidian Dist 37, Beijing 100191, Peoples R China
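Funding
National Natural Science Foundation of China
Keywords
Model reuse; convolutional neural network; CNN modularization; module composition
DOI
10.1145/3632744
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract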
With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing public DNN models for new tasks often fails due to mismatched functionality or performance. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches, CNNSplitter and GradSplitter, which decompose a trained convolutional neural network (CNN) model for N-class classification into N small reusable modules. Each module recognizes one of the N classes and contains a subset of the convolution kernels of the trained CNN model. The resulting modules can then be reused to patch existing CNN models or to build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore the search space, while the latter uses a gradient-based search method. Our experiments with three representative CNNs on three widely used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results when patching weak models. In particular, experiments on GradSplitter show that (1) patching weak models improves precision, recall, and F1-score by 17.13%, 4.95%, and 11.47% on average, respectively, and (2) for a new task, reusing modules achieves accuracy similar to that of models trained from scratch (the average accuracy loss is only 2.46%) without a costly training process. Our approaches provide a viable solution for the rapid development and improvement of CNN models.
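The idea of per-class modules and their composition can be illustrated with a toy sketch. It is not the authors' CNNSplitter or GradSplitter implementation: the per-kernel `features` vector, the linear `head`, and the hand-set binary `masks` are all hypothetical stand-ins, whereas in the paper the kernel selection is found by a genetic or gradient-based search. The sketch only shows how N masked modules, each keeping a subset of kernels, could be composed into an N-class classifier by taking the highest module score.

```python
import numpy as np

def module_score(features, head_weights, mask):
    """Score one class using only the kernels kept by this module's mask."""
    return float((features * mask) @ head_weights)

def compose_predict(features, head, masks):
    """Run every module on the same features and return the top-scoring class."""
    scores = [module_score(features, head[n], masks[n]) for n in range(len(masks))]
    return int(np.argmax(scores))

def kept_fraction(masks):
    """Average fraction of kernels retained per module."""
    return float(np.mean(masks))

# Hypothetical toy setup: 3 classes, 4 kernels (all values are made up).
head = np.array([[1.0, 0.5, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.5],
                 [0.5, 0.0, 0.0, 1.0]])
# One binary mask per class; 1 means the kernel belongs to that module.
masks = np.array([[1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [1, 0, 0, 1]])
# Assumed pooled activation of each kernel for a single input.
features = np.array([0.2, 0.1, 0.9, 0.8])

pred = compose_predict(features, head, masks)
print(pred, kept_fraction(masks))
```

In this toy setup each module keeps half of the kernels, and the composed classifier selects class 1 because its module produces the largest score.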
Pages: 39