An FPGA-Based Lightweight Semantic Segmentation Neural Network With Optimized Ghost Module

Cited by: 1
Authors
Chen, Yan [1]
Jiang, Jie [1]
Ma, Yan [1]
Affiliations
[1] Minist Educ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Beijing 100191, Peoples R China
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 13
Funding
National Natural Science Foundation of China;
Keywords
Embedded devices; field programmable gate array (FPGA); optimized ghost module; parallel processing; segmentation neural network; CNN ACCELERATOR;
DOI
10.1109/JIOT.2024.3391248
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
A semantic segmentation neural network assigns a semantic label to each pixel of the input image and is widely used in domains such as remote sensing, autonomous driving, and image analysis. However, neural networks demand both highly parallel computation and extensive data processing, which greatly challenges resource-constrained embedded devices. This article therefore designs a field programmable gate array (FPGA)-based semantic segmentation neural network that targets lightweight design, parallelism, and bandwidth through hardware-software co-design. The designed optimized ghost module halves the bandwidth requirement by reordering its internal structure. Group convolution, channel shuffle, and channel attention modules (CAMs) are used to further compress the network and improve accuracy. The segmentation network based on the optimized ghost module reduces computational complexity and parameter count by factors of 8.2 and 13.5, respectively. On the hardware side, an accelerator that is highly parallel in data input, processing, and output is designed. The CAM is divided into a parameter-intensive part and a computation-intensive part, which are computed in parallel by the ZYNQ's processing system and programmable logic, respectively. In the experiments, the accelerator reached a clock frequency of 240 MHz and segmented a 320 × 320 input in 12.06 ms. It achieves 198.16 GOPS while the entire board consumes only 9.84 W at full load, making it suitable for embedded devices with real-time and power constraints.
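The headline figures in the abstract can be turned into a few derived metrics. The short Python sketch below is not from the paper: the function name `derived_metrics` and the dictionary keys are illustrative. It also assumes that the reported 198.16 GOPS is an average sustained over the whole 12.06 ms inference, which the abstract does not state.

```python
# Figures quoted in the abstract.
LATENCY_MS = 12.06        # time to segment one 320 x 320 input
THROUGHPUT_GOPS = 198.16  # reported accelerator performance
BOARD_POWER_W = 9.84      # entire board at full load


def derived_metrics(latency_ms, throughput_gops, power_w):
    """Derive frame rate, work per frame, and efficiency figures.

    Assumption (not stated in the abstract): throughput_gops is the
    average rate sustained over the full latency of one inference.
    """
    latency_s = latency_ms / 1000.0
    return {
        "frames_per_second": 1.0 / latency_s,
        "gop_per_frame": throughput_gops * latency_s,
        "gops_per_watt": throughput_gops / power_w,
        "joules_per_frame": power_w * latency_s,
    }


metrics = derived_metrics(LATENCY_MS, THROUGHPUT_GOPS, BOARD_POWER_W)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption, the numbers correspond to roughly 83 frames per second and about 20 GOPS per watt of board power.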
Pages: 24247-24258
Page count: 12