Egocentric Vision for Human Activity Recognition Using Deep Learning

Cited by: 1
Authors
Douache, Malika [1,2,3]
Benmoussat, Badra Nawal [1]
Affiliations
[1] Univ Sci & Technol Oran Mohamed Boudiaf USTOMB, Automat Vis & Intelligent Syst Control Lab, Oran, Algeria
[2] Natl Inst Telecommun Informat & Commun Technol INT, Oran, Algeria
[3] Natl Higher Sch Telecommun Informat & Commun Techn, Oran, Algeria
Source
JOURNAL OF INFORMATION PROCESSING SYSTEMS | 2023, Vol. 19, No. 6
Funding
U.S. National Science Foundation
Keywords
Convolutional Neural Network; Deep Learning; Egocentric Vision (or First-Person Vision); Human Activity Recognition; Image Classification; Inertial Measurement Unit (IMU);
DOI
10.3745/JIPS.02.0207
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
This paper addresses the recognition of human activities from egocentric vision, i.e., video captured by body-worn cameras, which can support video surveillance, automatic search, and video indexing. It can also help assist elderly and frail persons and thereby improve their lives. Human activity recognition remains a difficult task because of the large variability in how actions are executed, especially when recognition is performed by an external device, such as a robot serving as a personal assistant. The inferred information is used both online, to assist the person, and offline, to support the personal assistant. The main purpose of this paper is to provide a simple and efficient recognition method, robust to the factors of variability in action execution, that uses only egocentric camera data together with a convolutional neural network and deep learning. In terms of accuracy, simulation results outperform the current state of the art by a significant margin: 61% when using egocentric camera data only, more than 44% when combining egocentric camera data with data from several stationary cameras, and more than 12% when using both inertial measurement unit (IMU) and egocentric camera data.
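The abstract describes recognizing activities from egocentric camera data alone with a convolutional neural network. As a rough illustration only, and not the authors' published pipeline, the PyTorch sketch below fine-tunes a pretrained CNN on frames extracted from egocentric video; the ResNet-18 backbone, the frames/<activity>/*.jpg folder layout, the class count, and all hyperparameters are assumptions made for this example.

# Hedged sketch: frame-level activity classification with a pretrained CNN,
# fine-tuned on egocentric video frames. Architecture and data layout are
# assumptions, not details taken from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_ACTIVITIES = 10          # assumed number of activity classes
FRAMES_DIR = "frames/train"  # hypothetical layout: frames/train/<activity>/*.jpg

# Standard ImageNet-style preprocessing for extracted video frames.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Frames are assumed to be stored one folder per activity label.
train_set = datasets.ImageFolder(FRAMES_DIR, transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Pretrained CNN with its final layer replaced for the activity classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_ACTIVITIES)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Short fine-tuning loop over egocentric frames.
model.train()
for epoch in range(5):
    for frames, labels in train_loader:
        frames, labels = frames.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(frames), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")

A per-frame classifier of this kind can be applied to frames extracted from egocentric video (e.g., with a video-to-JPG converter) and the per-frame predictions aggregated per clip; any fusion with stationary cameras or IMU data mentioned in the abstract would require additional components not shown here.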
Pages: 730-744
Page count: 15