An FPGA-Based Lightweight Semantic Segmentation Neural Network With Optimized Ghost Module

Cited by: 1

Authors:

Chen, Yan [1]

Jiang, Jie [1]

Ma, Yan [1]

Affiliations:

[1] Minist Educ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Beijing 100191, Peoples R China

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 13

Funding:

National Natural Science Foundation of China;

Keywords:

Embedded devices; field programmable gate array (FPGA); optimized ghost module; parallel processing; segmentation neural network; CNN ACCELERATOR;

DOI:

10.1109/JIOT.2024.3391248

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

A semantic segmentation neural network assigns a semantic label to each pixel of the input image and is widely used in domains such as remote sensing, autonomous driving, and image analysis. However, neural networks demand both highly parallel computation and the processing of large volumes of data, which is challenging for resource-constrained embedded devices. This article therefore designs a field programmable gate array (FPGA)-based semantic segmentation neural network that is optimized for lightweight structure, parallelism, and bandwidth through hardware-software co-design. The proposed optimized ghost module halves the bandwidth requirement by reordering its internal structure. Group convolution, channel shuffle, and channel attention modules (CAMs) are used to further compress the network and improve accuracy. The segmentation network built on the optimized ghost module reduces computational complexity and parameter count by factors of 8.2 and 13.5, respectively. On the hardware side, an accelerator that is highly parallel in data input, processing, and output is designed. The CAM is divided into two parts: 1) a parameter-intensive section and 2) a computation-intensive section, which are computed in parallel by the processing system and programmable logic sides of the ZYNQ, respectively. In our experiment, the accelerator reached a clock frequency of 240 MHz and segmented a 320 × 320 input in 12.06 ms. It achieves a performance of 198.16 GOPS while the entire board consumes only 9.84 W at full load, which makes it suitable for embedded devices with real-time and power-consumption requirements.
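The figures quoted in the abstract can be combined into a few derived metrics. The following is a minimal sketch, not taken from the paper: the variable names are illustrative, the frame rate assumes frames are processed back to back with no overlap, and the operations-per-frame estimate assumes the 198.16 GOPS figure is sustained over the whole 12.06 ms inference, which the abstract does not state.

```python
# Figures quoted in the abstract.
throughput_gops = 198.16  # reported performance, in GOPS
latency_ms = 12.06        # time to segment one 320 x 320 input, in ms
board_power_w = 9.84      # whole-board power at full load, in W

# Derived metrics (assumptions, see the lead-in).
# Frame rate if frames are processed back to back with no overlap.
frames_per_second = 1000.0 / latency_ms
# Energy efficiency using whole-board power.
gops_per_watt = throughput_gops / board_power_w
# Energy per frame: W * ms = mJ.
energy_per_frame_mj = board_power_w * latency_ms
# Operations per frame, assuming the GOPS figure is sustained for the full latency.
gop_per_frame = throughput_gops * latency_ms / 1000.0

print(f"frames per second: {frames_per_second:.2f}")
print(f"GOPS per watt:     {gops_per_watt:.2f}")
print(f"energy per frame:  {energy_per_frame_mj:.2f} mJ")
print(f"GOP per frame:     {gop_per_frame:.2f}")
```

Under these assumptions the design processes about 83 frames per second at roughly 20 GOPS/W, which is consistent with the abstract's claim of suitability for real-time, power-constrained embedded use.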

Citation

Pages: 24247-24258

Page count: 12
