Recognizing Personal Locations From Egocentric Videos

Cited by: 28
Authors
Furnari, Antonino [1 ]
Farinella, Giovanni Maria [1 ]
Battiato, Sebastiano [1 ]
Affiliations
[1] Univ Catania, Dept Math & Comp Sci, I-95124 Catania, Italy
Keywords
Context-aware computing; egocentric dataset; egocentric vision; first person vision; personal location recognition; CONTEXT; CLASSIFICATION; RECOGNITION; SCENE; SHAPE;
DOI
10.1109/THMS.2016.2612002
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contextual awareness in wearable computing enables the construction of intelligent systems that can interact with the user in a more natural way. In this paper, we study how personal locations arising from the user's daily activities can be recognized from egocentric videos. We assume that only a few training samples are available for learning purposes. Given the diversity of devices available on the market, we introduce a benchmark dataset containing egocentric videos of eight personal locations acquired by a user with four different wearable cameras. To make our analysis useful in real-world scenarios, we propose a method to reject negative locations, i.e., those not belonging to any of the categories of interest to the end user. We assess the performance of the main state-of-the-art representations for scene and object classification on the considered task, as well as the influence of device-specific factors such as the field of view and the wearing modality. Regarding the device-specific factors, experiments revealed that the best results are obtained with a head-mounted wide-angle device. Our analysis shows the effectiveness of representations based on convolutional neural networks, combined with basic transfer learning techniques and an entropy-based rejection algorithm.
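The entropy-based rejection mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold value and function name are assumptions, and the idea is simply that a flat (high-entropy) class posterior signals a "negative" location that belongs to none of the known categories.

```python
import numpy as np

def entropy_reject(class_probs, threshold=1.0):
    """Classify a frame among known personal locations, or reject it.

    class_probs: softmax probabilities over the known location classes.
    threshold:   assumed entropy cutoff (nats); above it the posterior is
                 considered too uncertain and the sample is rejected.
    Returns the predicted class index, or -1 for a rejected (negative) sample.
    """
    p = np.asarray(class_probs, dtype=float)
    p = p / p.sum()  # normalize defensively
    # Shannon entropy of the posterior; small epsilon avoids log(0)
    entropy = -np.sum(p * np.log(p + 1e-12))
    if entropy > threshold:
        return -1  # negative location: reject
    return int(np.argmax(p))

# A confident posterior is accepted; a uniform one is rejected
# (for 4 classes, the uniform entropy ln(4) ≈ 1.386 exceeds the threshold).
print(entropy_reject([0.9, 0.05, 0.03, 0.02]))   # → 0
print(entropy_reject([0.25, 0.25, 0.25, 0.25]))  # → -1
```

In practice the threshold would be tuned on validation data containing both positive and negative locations.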
Pages: 6 - 18
Page count: 13
Related Papers
50 items total
  • [21] Recognition of Activities of Daily Living from Egocentric Videos Using Hands Detected by a Deep Convolutional Network
    Nguyen, Thi-Hoa-Cuc
    Nebel, Jean-Christophe
    Florez-Revuelta, Francisco
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2018), 2018, 10882 : 390 - 398
  • [22] Selective Keyframe Summarisation for Egocentric Videos Based on Semantic Concept Search
    Yousefi, Paria
    Kuncheva, Ludmila I.
    2018 IEEE THIRD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, APPLICATIONS AND SYSTEMS (IPAS), 2018, : 19 - 24
  • [23] Recognizing emotions from videos by studying facial expressions, body postures and hand gestures
    Gavrilescu, Mihai
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 720 - 723
  • [24] Geometrical cues in visual saliency models for active object recognition in egocentric videos
    Buso, Vincent
    Benois-Pineau, Jenny
    Domenger, Jean-Philippe
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (22) : 10077 - 10095
  • [25] 3D Layout Propagation to Improve Object Recognition in Egocentric Videos
    Rituerto, Alejandro
    Murillo, Ana C.
    Guerrero, Jose J.
    COMPUTER VISION - ECCV 2014 WORKSHOPS, PT III, 2015, 8927 : 839 - 852
  • [26] Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
    Yun, Heeseung
    Gao, Ruohan
    Ananthabhotla, Ishwarya
    Kumar, Anurag
    Donley, Jacob
    Li, Chao
    Kim, Gunhee
    Ithapu, Vamsi Krishna
    Murdock, Calvin
    COMPUTER VISION - ECCV 2024, PT XXIV, 2025, 15082 : 256 - 274
  • [27] Automated Detection of Hands and Objects in Egocentric Videos, for Ambient Assisted Living Applications
    Nguyen, Thi Hoa Cuc
    Nebel, Jean-Christophe
    Hunter, Gordon
    Florez-Revuelta, Francisco
    2018 14TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENVIRONMENTS (IE 2018), 2018, : 91 - 94
  • [28] Unsupervised Gaze Prediction in Egocentric Videos by Energy-based Surprise Modeling
    Aakur, Sathyanarayanan N.
    Bagavathi, Arunkumar
    VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP, 2021, : 935 - 942
  • [30] Leveraging Textural Features for Recognizing Actions in Low Quality Videos
    Rahman, Saimunur
    See, John
    Ho, Chiung Ching
    9TH INTERNATIONAL CONFERENCE ON ROBOTIC, VISION, SIGNAL PROCESSING AND POWER APPLICATIONS: EMPOWERING RESEARCH AND INNOVATION, 2017, 398 : 237 - 245