An Efficient Algorithm-Hardware Co-Design for Radar-Based Fall Detection With Multi-Branch Convolutions

Cited by: 2

Authors:

Ou, Zixuan [1]

Yu, Bing [1]

Ye, Wenbin [1]

Affiliations:

[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen 518060, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | 2023 / Vol. 70 / No. 04

Keywords:

Fall detection; convolutional neural network; radar signal processing; algorithm-hardware co-design; low power; low cost; NEURAL-NETWORK; CLASSIFICATION; ACCELERATOR; ARRAY;

DOI:

10.1109/TCSI.2022.3232918

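Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification code: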

0808 ; 0809 ;

Abstract:

In this paper, we propose an efficient algorithm-hardware co-design framework to realize radar-based fall detection with limited resources. We first design a compact neural network model, named MB-Net, that uses multi-branch convolutions combined with a multi-scale wavelet transform to extract features from radar time-series data. We then design an FPGA-based neural network (NN) accelerator tailored to the proposed network. The accelerator replaces general multipliers with non-exact multipliers to reduce hardware cost. For the multi-branch convolution layer, a novel layer computing sequence is introduced to improve the efficiency of the processing element (PE) array and reduce the memory footprint. In addition, the average pooling operation in the proposed network is folded into the quantization factors to further reduce hardware cost. Experimental results show that MB-Net maintains competitive performance compared with state-of-the-art methods at a significantly lower hardware cost. The proposed network model is implemented on a Zynq ZC702 board using only 3615 LUTs, 1843 FFs, 11.5 BRAMs, and 8 DSPs, with a power consumption of 0.234 W. Through algorithm and hardware co-optimization, the fall detection accelerator achieves 95% PE efficiency and a latency of 0.346 ms per radar sample inference, consuming only 80.96 µJ of energy.
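The abstract does not say how the average pooling is folded into the quantization factors. One common reading is that the divide-by-window-size step of the pooling is absorbed into the requantization scale, so the hardware only has to sum. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: the function names, the int8 output range, the window size of 4, and the scale of 2^-5 are all hypothetical choices made for illustration.

```python
import numpy as np

def avg_pool_then_requantize(acc, k, scale):
    """Reference path: average each window of k accumulators, then rescale to int8."""
    pooled = acc.reshape(-1, k).mean(axis=1)
    return np.clip(np.round(pooled * scale), -128, 127).astype(np.int8)

def sum_pool_with_folded_scale(acc, k, scale):
    """Folded path: only sum each window; the 1/k divide lives in the scale."""
    folded_scale = scale / k  # computed once, offline (assumed, not from the paper)
    summed = acc.reshape(-1, k).sum(axis=1)
    return np.clip(np.round(summed * folded_scale), -128, 127).astype(np.int8)

# Hypothetical 32-bit accumulator outputs of a convolution layer.
rng = np.random.default_rng(0)
acc = rng.integers(-2000, 2000, size=64).astype(np.int32)

k = 4              # pooling window (assumed)
scale = 2.0 ** -5  # requantization factor (assumed)

ref = avg_pool_then_requantize(acc, k, scale)
folded = sum_pool_with_folded_scale(acc, k, scale)
```

With a power-of-two window size the two paths agree exactly, because dividing by `k` is exact in floating point. For other window sizes the results can differ by one least-significant bit at rounding boundaries.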


Pages: 1613-1624

Page count: 12
