Enhancing Recognition of Human-Object Interaction from Visual Data Using Egocentric Wearable Camera

Cited by: 0
Authors
Hamid, Danish [1 ]
Ul Haq, Muhammad Ehatisham [1 ]
Yasin, Amanullah [1 ]
Murtaza, Fiza [1 ]
Azam, Muhammad Awais [2 ]
Affiliations
[1] Air Univ, Fac Comp & Artificial Intelligence FCAI, Dept Creat Technol, Islamabad 44000, Pakistan
[2] Whitecliffe, Technol & Innovat Res Grp, Sch Informat Technol, Wellington 6145, New Zealand
Keywords
egocentric; hand pose; human-object interaction; machine learning; object recognition; wearable camera
DOI
10.3390/fi16080269
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Object detection and human action recognition are significant in many real-world applications. Understanding how a person interacts with different objects, i.e., human-object interaction, is also crucial in this regard, since it enables diverse applications related to security, surveillance, and immersive reality. This study therefore explored the potential of a wearable camera for object detection and human-object interaction recognition, a key technology for the future Internet and ubiquitous computing. We propose a system that uses an egocentric camera view to recognize objects and human-object interactions by analyzing the wearer's hand pose. Our core idea is to leverage the user's hand joint data, extracted from the egocentric camera view, to recognize different objects and the related interactions. Traditional methods for human-object interaction recognition rely on a third-person, i.e., exocentric, camera view and extract morphological and color/texture-related features; they therefore often fall short under occlusion, camera variations, and background clutter. Moreover, deep learning-based approaches require substantial training data, incurring significant computational overhead. Our approach capitalizes on hand joint data captured from an egocentric perspective, offering a robust alternative that addresses these limitations. We propose an innovative machine learning-based technique for feature extraction and description from 3D hand joint data, realized in two distinct schemes: object-dependent and object-independent interaction recognition. The proposed method is computationally more efficient than deep learning methods and was validated on the publicly available HOI4D dataset, where it achieved a best-case average F1-score of 74%. The proposed system paves the way for intuitive human-computer collaboration within the future Internet, enabling applications such as seamless object manipulation and natural user interfaces for smart devices, human-robot interaction, virtual reality, and augmented reality.
Pages: 17
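
To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' released code: it derives a simple geometric descriptor (pairwise joint distances) from 3D hand-joint data and trains a classical machine learning classifier, standing in for the paper's feature extraction and interaction recognition steps. The 21-joint hand model, the random-forest classifier, and the synthetic data are assumptions for illustration; in the paper, the joints come from egocentric HOI4D recordings.

```python
# Minimal sketch (illustrative assumptions, not the authors' implementation):
# geometric features from 3D hand joints + a classical ML classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

N_JOINTS = 21  # common hand model: wrist + 4 joints per finger (assumption)

def joint_features(joints: np.ndarray) -> np.ndarray:
    """Describe a (N_JOINTS, 3) hand pose by its pairwise joint distances,
    a simple translation/rotation-insensitive descriptor (illustrative only)."""
    diffs = joints[:, None, :] - joints[None, :, :]   # (21, 21, 3) offsets
    dists = np.linalg.norm(diffs, axis=-1)            # (21, 21) distances
    iu = np.triu_indices(N_JOINTS, k=1)               # upper triangle, no diagonal
    return dists[iu]                                  # 210-dimensional feature vector

# Synthetic stand-in for per-frame 3D hand joints and interaction labels;
# real data would come from the egocentric camera's hand-pose estimates.
rng = np.random.default_rng(0)
X = np.stack([joint_features(rng.normal(size=(N_JOINTS, 3))) for _ in range(200)])
y = rng.integers(0, 4, size=200)  # e.g., 4 interaction classes (assumption)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, scoring="f1_macro", cv=5).mean())
```

The macro F1 scoring mirrors the abstract's averaged F1 evaluation; the object-dependent variant would additionally condition the classifier on the recognized object class.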