Incremental human action recognition with dual memory

Cited by: 4
Authors
Gutoski, Matheus [1]
Lazzaretti, Andre Eugenio [1]
Lopes, Heitor Silverio [1]
Affiliations
[1] Univ Tecnol Fed Parana, Av Sete Setembro 3165, BR-80230901 Curitiba, Parana, Brazil
Keywords
Incremental learning; Human Action Recognition; Metric Learning; Triplet Networks; Dual-memory Extreme Value Machine
DOI
10.1016/j.imavis.2021.104313
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Incremental learning is a topic of great interest in the current state of machine learning research. Real-world problems often require a classifier to incorporate new knowledge while preserving what was learned before. One of the most challenging problems in computer vision is Human Action Recognition (HAR) in videos. However, most of the existing works approach HAR from a non-incremental point of view. This work proposes a framework for performing HAR in the incremental learning scenario called Incremental Human Action Recognition with Dual Memory (IHAR-DM). IHAR-DM contains three main components: a 3D convolutional neural network for capturing spatio-temporal features; a Triplet Network to perform metric learning; and the dual-memory Extreme Value Machine, which is introduced in this work. The proposed method is compared with 10 other state-of-the-art incremental learning models. We propose five experimental settings containing different numbers of tasks and classes using two widely known HAR datasets: UCF-101 and HMDB51. Our results show superior performance in terms of Normalized Mutual Information (NMI) and Inter-task Intransigence (ITI), which is a new metric proposed in this work. Overall results show the feasibility of the proposal for real HAR problems, which mostly present the requirements imposed by incremental learning. © 2021 Elsevier B.V. All rights reserved.
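The abstract names a Triplet Network for metric learning but does not state the loss it optimizes. As a rough illustration only, the sketch below shows the standard margin-based triplet loss commonly used with such networks. The Euclidean distance, the margin value, and the function names are assumptions made for this example and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not the paper's).

    The loss is zero once the negative sample is at least `margin`
    farther from the anchor than the positive sample is.
    """
    d_ap = euclidean(anchor, positive)  # anchor to same-class sample
    d_an = euclidean(anchor, negative)  # anchor to different-class sample
    return max(0.0, d_ap - d_an + margin)


# Toy 2-D embeddings, chosen only for illustration.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
easy_negative = [3.0, 4.0]   # far from the anchor: no penalty
hard_negative = [0.0, 1.5]   # close to the anchor: penalized

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
```

In this toy case the well-separated negative contributes no loss, while the negative lying inside the margin does, which is the behavior that pulls same-class embeddings together and pushes different-class embeddings apart.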
Pages: 13