CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer

Cited by: 1
Authors
Yu, Haowen [1]
Chen, Liming [2]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Oxford Rd, Manchester M13 9PL, England
[2] Univ Ulster, Sch Comp, Cromore Rd, Belfast BT52 1SA, Northern Ireland
Keywords
Transformer; MetaFormer; Attention mechanism; Convolutional neural network
DOI
10.1007/s00138-023-01446-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional and Transformer-based backbone architectures are the two dominant, widely accepted model families in computer vision. Nevertheless, deciding which backbone architecture performs better, and under which circumstances, remains a challenge and therefore a focus of research. In this paper, we conduct an in-depth investigation into the differences in the macroscopic backbone design of CNN and Transformer models, with the ultimate purpose of developing new models that combine the strengths of both types of architecture for effective image classification. Specifically, we first analyze the structures of both models and identify four main differences. We then design four sets of ablation experiments on the ImageNet-1K dataset, using image classification as the example task, to study the impact of these four differences on model performance. Based on the experimental results, we derive four observations as rules of thumb for designing a vision model backbone architecture. Informed by these findings, we then propose a novel model called CMNet, which combines the experimentally validated best design practices of CNN and Transformer architectures. Finally, we carry out extensive experiments on CMNet against baseline classifiers using the same dataset. Initial results show that CMNet reaches a best top-1 accuracy of 80.08% on the ImageNet-1K validation set, which is highly competitive with previous classical models of similar computational complexity. Implementation details, algorithms, and code are publicly available on GitHub: https://github.com/Arwin-Yu/CMNet.
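For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how top-1 accuracy is conventionally computed: a prediction counts as correct when the highest-scoring class matches the true label. The function name `top1_accuracy` and the toy scores and labels are illustrative assumptions and are not taken from the CMNet repository.

```python
def top1_accuracy(scores, labels):
    """Return the fraction of samples whose highest-scoring class is the true label.

    scores: list of per-class score lists, one list per sample (hypothetical data).
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the model's top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy example with 4 samples and 3 classes (made-up numbers, not ImageNet-1K results).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]

accuracy = top1_accuracy(toy_scores, toy_labels)
print(f"top-1 accuracy: {accuracy:.2%}")
```

On the ImageNet-1K validation set the same calculation would run over 50,000 images and 1,000 classes.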
Pages: 13
Source
Machine Vision and Applications, 2023, 34