Multimodal and Multi-granularity Graph Convolutional Networks for Elderly Daily Activity Recognition

Authors
Ding J. [1]
Shu X.-B. [1]
Huang P. [1]
Yao Y.-Z. [1]
Song Y. [1]
Affiliations
[1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing
Source
Ruan Jian Xue Bao/Journal of Software | 2023, Vol. 34, No. 5
Keywords
elderly activity recognition; graph convolutional network (GCN); multi-granularity; multimodal
DOI
10.13328/j.cnki.jos.006439
Abstract
As population aging becomes more serious, increasing attention is being paid to the safety of elderly people who are home alone. To provide early warning, alarms, and reports of dangerous behaviors, several domestic and international research institutions are studying intelligent monitoring of the daily activities of the elderly from a robot's view. To promote the industrialization of these technologies, this work studies how to automatically recognize daily activities of the elderly, such as “drinking water”, “washing hands”, “reading a book”, and “reading a newspaper”. An investigation of daily activity videos of the elderly shows that the semantics of these activities are clearly fine-grained. For example, “drinking water” and “taking medicine” are highly similar in semantics, and only a small number of video frames accurately reflect their category semantics. To address this problem in elderly activity recognition, this work proposes a new multimodal multi-granularity graph convolutional network (MM-GCN), which applies graph convolutional networks to four modalities, i.e., the skeleton (“point”), bone (“line”), frame (“frame”), and proposal (“segment”), to model the activities of the elderly and capture their semantics at the four granularities of “point-line-frame-proposal”. Experiments are conducted to validate the activity recognition performance of the proposed method on ETRI-Activity3D (110000+ videos, 50+ classes), the largest dataset of daily activities of the elderly. Compared with state-of-the-art methods, the proposed MM-GCN achieves the highest recognition accuracy. In addition, to verify the robustness of MM-GCN on general human action recognition tasks, experiments are also carried out on the NTU RGB+D benchmark, and the results show that MM-GCN is comparable to the state-of-the-art (SOTA) methods.
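To make the “point” and “line” modalities concrete, the following is a minimal sketch of how bone vectors can be derived from skeleton joints and passed through one graph convolution layer. It is not the MM-GCN architecture itself: the 5-joint skeleton, the edge list `EDGES`, the helper names, the random weights, and the choice of the common symmetric normalization are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton as (child, parent) pairs.
# This is NOT the joint layout of ETRI-Activity3D or NTU RGB+D.
EDGES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def joints_to_bones(joints, edges):
    """Bone ("line") features: vector from the parent joint to the child joint."""
    bones = np.zeros_like(joints)  # the root joint keeps a zero bone vector
    for child, parent in edges:
        bones[child] = joints[child] - joints[parent]
    return bones


def normalized_adjacency(num_nodes, edges):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (an assumed, common choice)."""
    a = np.eye(num_nodes)  # self-loops
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(x, a_hat, weight):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ x @ weight, 0.0)


rng = np.random.default_rng(0)
joints = rng.normal(size=(5, 3))   # one frame: 5 joints with 3D coordinates
bones = joints_to_bones(joints, EDGES)
a_hat = normalized_adjacency(5, EDGES)
weight = rng.normal(size=(3, 8))   # illustrative random weights, not trained

joint_feat = gcn_layer(joints, a_hat, weight)  # "point" granularity features
bone_feat = gcn_layer(bones, a_hat, weight)    # "line" granularity features
```

In this sketch the two modalities share one graph and one weight matrix only to keep the example short; a real model would typically learn separate parameters per modality and add frame-level and proposal-level graphs on top.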
© 2023 Chinese Academy of Sciences. All rights reserved.
Pages: 2350-2364
Page count: 14