Automated efficient traffic gesture recognition using swin transformer-based multi-input deep network with radar images

Cited by: 0
Authors
Firat, Huseyin [1 ]
Uzen, Huseyin [2 ]
Atila, Orhan [3 ]
Sengur, Abdulkadir [4 ]
Affiliations
[1] Dicle Univ, Fac Engn, Dept Comp Engn, Diyarbakir, Turkiye
[2] Bingol Univ, Fac Engn & Architecture, Dept Comp Engn, Bingol, Turkiye
[3] Firat Univ, Technol Fac, Elect Elect Engn Dept, Elazig, Turkiye
[4] Firat Univ, Fac Technol, Dept Elect & Elect Engn, Elazig, Turkiye
Keywords
Deep learning; Radar images; Swin transformers; Traffic hand gesture;
DOI
10.1007/s11760-024-03664-6
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
Radar-based artificial intelligence (AI) applications have gained significant attention recently, spanning from fall detection to gesture recognition. Growing interest in this field initially drove a shift towards deep convolutional networks; transformers have since emerged to address the limitations of convolutional neural network methods and have become increasingly popular in the AI community. In this paper, we present a novel hybrid approach for radar-based traffic hand gesture classification using transformers. Traffic hand gesture recognition (HGR) is important in AI applications, and our proposed three-phase approach addresses both the efficiency and the effectiveness of traffic HGR. In the first phase, feature vectors are extracted from the input radar images using the pre-trained DenseNet-121 model. These features are then concatenated to gather information from the diverse radar sensors, followed by a patch extraction operation. The concatenated features from all inputs are processed in a Swin transformer block to facilitate further HGR. The classification stage applies global average pooling, Dense, and Softmax layers in sequence. To assess the effectiveness of our method on the Ulm University radar dataset, we employ various performance metrics, including accuracy, precision, recall, and F1-score, achieving an average accuracy of 90.54%. We compare this score with existing approaches to demonstrate the competitiveness of the proposed method.
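The three-phase pipeline described in the abstract (per-sensor backbone features, concatenation, patch extraction, a Swin-style windowed-attention block, then global average pooling, Dense, and Softmax) can be sketched at the shape level. The following NumPy sketch is illustrative only, not the authors' implementation: the random backbone stub stands in for the pre-trained DenseNet-121, the feature-map size (7x7x64), window size, sensor count, and class count are all assumptions, and the attention block is a single-head, shift-free toy version of a Swin block.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone_features(image):
    """Stand-in for the pre-trained DenseNet-121 feature extractor.
    Maps one (H, W, C) radar image to a (7, 7, 64) feature map
    (assumed shape; the real backbone is pre-trained, not random)."""
    return rng.standard_normal((7, 7, 64))

def extract_patches(fmap):
    """Flatten the fused feature map into a sequence of patch tokens
    (one token per spatial position, i.e. patch size 1)."""
    h, w, d = fmap.shape
    return fmap.reshape(h * w, d)          # (tokens, dim)

def window_attention(tokens, window=7):
    """Toy single-head self-attention applied within fixed windows:
    the core idea of a Swin transformer block (no shift, no MLP)."""
    n, d = tokens.shape
    out = np.empty_like(tokens)
    for start in range(0, n, window):
        win = tokens[start:start + window]
        scores = win @ win.T / np.sqrt(d)
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=-1, keepdims=True)
        out[start:start + window] = attn @ win
    return tokens + out                     # residual connection

def classify(tokens, n_classes=8):
    """Classification head: global average pooling -> Dense -> Softmax."""
    pooled = tokens.mean(axis=0)                                   # (d,)
    logits = pooled @ rng.standard_normal((tokens.shape[1], n_classes))
    logits -= logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()

# Three radar sensors (assumed count) -> three feature maps,
# concatenated along the channel axis to fuse the sensor views.
sensor_images = [rng.standard_normal((64, 64, 1)) for _ in range(3)]
feats = [backbone_features(img) for img in sensor_images]
fused = np.concatenate(feats, axis=-1)      # (7, 7, 192)
tokens = extract_patches(fused)             # (49, 192)
tokens = window_attention(tokens, window=7) # Swin-style windowed attention
probs = classify(tokens, n_classes=8)       # gesture class probabilities
print(probs.shape)
```

The sketch only demonstrates how the tensor shapes flow through the three phases; in the actual method the backbone, transformer block, and Dense layer all carry trained weights.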
Pages: 11
Related Papers
50 records in total
  • [1] Malware Detection for Portable Executables Using a Multi-input Transformer-based Approach
    Huoh, Ting-Li
    Miskell, Timothy
    Barut, Onur
    Luo, Yan
    Li, Peilong
    Zhang, Tong
    2024 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS, ICNC, 2024, : 778 - 782
  • [2] Multi-Input Deep Learning Based FMCW Radar Signal Classification
    Cha, Daewoong
    Jeong, Sohee
    Yoo, Minwoo
    Oh, Jiyong
    Han, Dongseog
    ELECTRONICS, 2021, 10 (10)
  • [3] Transformer-based deep reverse attention network for multi-sensory human activity recognition
    Pramanik, Rishav
    Sikdar, Ritodeep
    Sarkar, Ram
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 122
  • [4] SketchFormer: transformer-based approach for sketch recognition using vector images
    Parihar, Anil Singh
    Jain, Gaurav
    Chopra, Shivang
    Chopra, Suransh
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (06) : 9075 - 9091
  • [5] Boosting multi-target recognition performance with multi-input multi-output radar-based angular subspace projection and multi-view deep neural network
    Kurtoglu, Emre
    Biswas, Sabyasachi
    Gurbuz, Ali C.
    Gurbuz, Sevgi Zubeyde
    IET RADAR SONAR AND NAVIGATION, 2023, 17 (07) : 1115 - 1128
  • [6] CSTSUNet: A Cross Swin Transformer-Based Siamese U-Shape Network for Change Detection in Remote Sensing Images
    Wu, Yaping
    Li, Lu
    Wang, Nan
    Li, Wei
    Fan, Junfang
    Tao, Ran
    Wen, Xin
    Wang, Yanfeng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [7] Transformer-based multi-source images instance segmentation network for composite materials
    Ke, Y.
    Fu, Y.
    Zhou, W.
    Zhu, W.
    Hongwai yu Jiguang Gongcheng/Infrared and Laser Engineering, 2023, 52 (02)
  • [8] Multi-Stream Single Network: Efficient Compressed Video Action Recognition With a Single Multi-Input Multi-Output Network
    Terao, Hayato
    Noguchi, Wataru
    Iizuka, Hiroyuki
    Yamamoto, Masahito
    IEEE ACCESS, 2024, 12 : 20983 - 20997
  • [9] M-Swin: Transformer-Based Multiscale Feature Fusion Change Detection Network Within Cropland for Remote Sensing Images
    Pan, Jun
    Bai, Yuchuan
    Shu, Qidi
    Zhang, Zhuoer
    Hu, Jiarui
    Wang, Mi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 16