Recognizing Personal Locations From Egocentric Videos

Cited by: 28
Authors
Furnari, Antonino [1 ]
Farinella, Giovanni Maria [1 ]
Battiato, Sebastiano [1 ]
Affiliations
[1] Univ Catania, Dept Math & Comp Sci, I-95124 Catania, Italy
Keywords
Context-aware computing; egocentric dataset; egocentric vision; first person vision; personal location recognition; CONTEXT; CLASSIFICATION; RECOGNITION; SCENE; SHAPE;
DOI
10.1109/THMS.2016.2612002
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contextual awareness in wearable computing enables the construction of intelligent systems that can interact with the user in a more natural way. In this paper, we study how personal locations arising from the user's daily activities can be recognized from egocentric videos. We assume that only a few training samples are available for learning purposes. Given the diversity of devices available on the market, we introduce a benchmark dataset containing egocentric videos of eight personal locations acquired by a user with four different wearable cameras. To make our analysis useful in real-world scenarios, we propose a method to reject negative locations, i.e., those not belonging to any of the categories of interest to the end user. We assess the performance of the main state-of-the-art representations for scene and object classification on the considered task, as well as the influence of device-specific factors such as the field of view and the wearing modality. Regarding the device-specific factors, experiments revealed that the best results are obtained with a head-mounted wide-angle device. Our analysis shows the effectiveness of representations based on convolutional neural networks, combined with basic transfer learning techniques and an entropy-based rejection algorithm.
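The entropy-based rejection mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold value and function name are assumptions, and the idea is simply that a flat (high-entropy) class posterior signals a "negative" location that belongs to none of the known categories.

```python
import numpy as np

def entropy_reject(class_probs, threshold=1.0):
    """Classify a frame among known personal locations, or reject it.

    class_probs: softmax probabilities over the known location classes.
    threshold:   assumed entropy cutoff (nats); above it the posterior is
                 considered too uncertain and the sample is rejected.
    Returns the predicted class index, or -1 for a rejected (negative) sample.
    """
    p = np.asarray(class_probs, dtype=float)
    p = p / p.sum()  # normalize defensively
    # Shannon entropy of the posterior; small epsilon avoids log(0)
    entropy = -np.sum(p * np.log(p + 1e-12))
    if entropy > threshold:
        return -1  # negative location: reject
    return int(np.argmax(p))

# A confident posterior is accepted; a uniform one is rejected
# (for 4 classes, the uniform entropy ln(4) ≈ 1.386 exceeds the threshold).
print(entropy_reject([0.9, 0.05, 0.03, 0.02]))   # → 0
print(entropy_reject([0.25, 0.25, 0.25, 0.25]))  # → -1
```

In practice the threshold would be tuned on validation data containing both positive and negative locations.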
Pages: 6 - 18
Page count: 13
Related Papers
50 items total
  • [21] Recognition of Activities of Daily Living from Egocentric Videos Using Hands Detected by a Deep Convolutional Network
    Nguyen, Thi-Hoa-Cuc
    Nebel, Jean-Christophe
    Florez-Revuelta, Francisco
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2018), 2018, 10882 : 390 - 398
  • [22] Selective Keyframe Summarisation for Egocentric Videos Based on Semantic Concept Search
    Yousefi, Paria
    Kuncheva, Ludmila I.
    2018 IEEE THIRD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, APPLICATIONS AND SYSTEMS (IPAS), 2018, : 19 - 24
  • [23] Recognizing emotions from videos by studying facial expressions, body postures and hand gestures
    Gavrilescu, Mihai
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 720 - 723
  • [24] Geometrical cues in visual saliency models for active object recognition in egocentric videos
    Buso, Vincent
    Benois-Pineau, Jenny
    Domenger, Jean-Philippe
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (22) : 10077 - 10095
  • [25] 3D Layout Propagation to Improve Object Recognition in Egocentric Videos
    Rituerto, Alejandro
    Murillo, Ana C.
    Guerrero, Jose J.
    COMPUTER VISION - ECCV 2014 WORKSHOPS, PT III, 2015, 8927 : 839 - 852
  • [26] Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
    Yun, Heeseung
    Gao, Ruohan
    Ananthabhotla, Ishwarya
    Kumar, Anurag
    Donley, Jacob
    Li, Chao
    Kim, Gunhee
    Ithapu, Vamsi Krishna
    Murdock, Calvin
    COMPUTER VISION - ECCV 2024, PT XXIV, 2025, 15082 : 256 - 274
  • [27] Automated Detection of Hands and Objects in Egocentric Videos, for Ambient Assisted Living Applications
    Nguyen, Thi Hoa Cuc
    Nebel, Jean-Christophe
    Hunter, Gordon
    Florez-Revuelta, Francisco
    2018 14TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENVIRONMENTS (IE 2018), 2018, : 91 - 94
  • [28] Unsupervised Gaze Prediction in Egocentric Videos by Energy-based Surprise Modeling
    Aakur, Sathyanarayanan N.
    Bagavathi, Arunkumar
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, 2021, : 935 - 942
  • [30] Leveraging Textural Features for Recognizing Actions in Low Quality Videos
    Rahman, Saimunur
    See, John
    Ho, Chiung Ching
    9TH INTERNATIONAL CONFERENCE ON ROBOTIC, VISION, SIGNAL PROCESSING AND POWER APPLICATIONS: EMPOWERING RESEARCH AND INNOVATION, 2017, 398 : 237 - 245