Recognition of Human Interactions in Still Images using AdaptiveDRNet with Multi-level Attention

Cited by: 4
Authors
Dey, Arnab [1 ]
Biswas, Samit [1 ]
Le, Dac-Nhuong [2 ]
Affiliations
[1] Indian Inst Engn Sci & Technol, Comp Sci & Technol, Howrah 711103, India
[2] Haiphong Univ, Fac Informat Technol, Haiphong 180000, Vietnam
Funding
UK Research and Innovation (UKRI);
Keywords
Human interaction recognition; still images; AdaptiveDRNet; multi-level attention; human interactions
DOI
10.14569/IJACSA.2023.01410103
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Human-Human Interaction Recognition (H2HIR) is a multidisciplinary field that combines computer vision, deep learning, and psychology. Its primary objective is to decode and understand the intricacies of human-human interactions. H2HIR is important across many domains because it enables machines to perceive, comprehend, and respond to human social behaviors, gestures, and communication patterns. This study aims to identify human-human interactions from a single frame, i.e., from a still image. Video-based interaction recognition is a well-established research domain that relies on spatio-temporal information; the task becomes significantly harder for still images because these intrinsic spatio-temporal features are absent. This research introduces a novel deep learning model called AdaptiveDRNet with Multi-level Attention to recognize Human-Human (H2H) interactions. Our proposed method demonstrates outstanding performance on the Human-Human Interaction Image dataset (H2HID), which comprises 4049 meticulously curated images representing fifteen distinct human interactions, and on the publicly accessible HII and HIIv2 related benchmark datasets. Notably, our proposed model achieves a validation accuracy of 97.20% in the classification of human-human interaction images, surpassing the performance of the EfficientNet, InceptionResNetV2, NASNet Mobile, ConvXNet, ResNet50, and VGG-16 models. The significance of H2H interaction recognition lies in its capacity to enhance communication, improve decision-making, and ultimately contribute to the well-being and efficiency of individuals and society as a whole.
Pages: 984-994
Page count: 11