Ultra-Range Gesture Recognition using a web-camera in Human-Robot Interaction

Cited by: 9
Authors
Bamani, Eran [1 ]
Nissinman, Eden [1 ]
Meir, Inbar [1 ]
Koenigsberg, Lisa [1 ]
Sintov, Avishai [1 ]
Affiliations
[1] Tel Aviv Univ, Sch Mech Engn, Haim Levanon St, IL-6997801 Tel Aviv, Israel
Funding
Academy of Finland;
Keywords
Human-Robot Interaction; Ultra-Range Gesture Recognition; Graph Convolutional Network; Vision Transformer; HAND; COMMUNICATION;
DOI
10.1016/j.engappai.2024.108443
Chinese Library Classification
TP [automation technology, computer technology];
Subject Classification Code
0812;
Abstract
Hand gestures play a significant role in human interactions where non-verbal intentions, thoughts and commands are conveyed. In Human-Robot Interaction (HRI), hand gestures offer a similarly efficient medium for conveying clear and rapid directives to a robotic agent. However, state-of-the-art vision-based methods for gesture recognition have been shown to be effective only up to a user-camera distance of seven meters. Such a short range limits practical HRI with, for example, service robots, search-and-rescue robots and drones. In this work, we address the Ultra-Range Gesture Recognition (URGR) problem, aiming for a recognition distance of up to 25 m in the context of HRI. We propose the URGR framework, a novel deep-learning approach that uses solely a simple RGB camera. Gesture inference is based on a single image. First, a novel super-resolution model termed High-Quality Network (HQ-Net) uses a set of self-attention and convolutional layers to enhance the low-resolution image of the user. Then, we propose a novel URGR classifier termed Graph Vision Transformer (GViT), which takes the enhanced image as input. GViT combines the benefits of a Graph Convolutional Network (GCN) and a modified Vision Transformer (ViT). Evaluation of the proposed framework over diverse test data yields a high recognition rate of 98.1%. The framework also exhibits superior performance compared to human recognition at ultra-range distances. Using the framework, we analyze and demonstrate the performance of an autonomous quadruped robot directed by human gestures in complex ultra-range indoor and outdoor environments, achieving a 96% recognition rate on average.
Pages: 19
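The abstract describes a two-stage pipeline: HQ-Net enhances a low-resolution image of the user, and GViT classifies the gesture from the enhanced image. The PyTorch sketch below illustrates only that data flow; all layer sizes, module internals, the number of gesture classes, and the linear stand-in for the GCN are assumptions made for illustration, not the authors' implementation.

# Minimal sketch of the two-stage URGR pipeline from the abstract.
# Everything beyond "self-attention + convolution, then GCN + ViT"
# is an illustrative assumption, not the paper's architecture.
import torch
import torch.nn as nn

class HQNet(nn.Module):
    """Super-resolution stage: convolutional layers plus self-attention,
    enhancing a low-resolution RGB crop of the user (per the abstract)."""
    def __init__(self, channels=32):
        super().__init__()
        self.conv_in = nn.Conv2d(3, channels, 3, padding=1)
        self.attn = nn.MultiheadAttention(channels, num_heads=4, batch_first=True)
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),  # 2x spatial upscaling
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        f = torch.relu(self.conv_in(x))            # (B, C, H, W)
        b, c, h, w = f.shape
        seq = f.flatten(2).transpose(1, 2)         # (B, H*W, C) pixel tokens
        attn_out, _ = self.attn(seq, seq, seq)     # self-attention over pixels
        f = (seq + attn_out).transpose(1, 2).reshape(b, c, h, w)
        return self.upsample(f)                    # enhanced image

class GViT(nn.Module):
    """Classifier stage: a graph-style token-mixing step followed by a
    transformer encoder, echoing the GCN + modified-ViT combination."""
    def __init__(self, num_gestures=6, patch=16, dim=128):  # class count illustrative
        super().__init__()
        self.patchify = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.graph_mix = nn.Linear(dim, dim)  # crude stand-in for a GCN layer
        enc = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.head = nn.Linear(dim, num_gestures)

    def forward(self, x):
        tokens = self.patchify(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        tokens = torch.relu(self.graph_mix(tokens))
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))                  # gesture logits

# Single-image inference, as in the paper: enhance, then classify.
frame = torch.rand(1, 3, 64, 64)     # low-resolution RGB crop of a distant user
logits = GViT()(HQNet()(frame))      # (1, num_gestures)
gesture = logits.argmax(dim=1)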