A Multimodal Fusion Approach for Human Activity Recognition

Cited by: 10
Authors
Koutrintzes, Dimitrios [1 ]
Spyrou, Evaggelos [2 ]
Mathe, Eirini [3 ]
Mylonas, Phivos [3 ]
Affiliations
[1] Natl Ctr Sci Res Demokritos, Inst Informat & Telecommun, Athens, Greece
[2] Univ Thessaly, Dept Informat & Telecommun, Lamia, Greece
[3] Ionian Univ, Dept Informat, Corfu, Greece
Keywords
Human activity recognition; multimodal fusion; deep convolutional neural networks
DOI
10.1142/S0129065723500028
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
The problem of human activity recognition (HAR), which has numerous applications, has been attracting increasing attention from the research community. It consists of recognizing human motion and/or behavior within a given image or video sequence, using raw sensor measurements as input. In this paper, a multimodal approach to video-based HAR is proposed. It is based on 3D visual data collected with an RGB+depth camera, which yields both raw video and 3D skeletal sequences. These data are transformed into six different 2D image representations: four lie in the spectral domain and a fifth is a pseudo-colored image, all five being derived from the skeletal data. The sixth is a "dynamic" image, i.e., an artificially created image that summarizes the RGB data of the whole video sequence in a visually comprehensible way. To classify a given activity video, all six 2D images are first extracted; six trained convolutional neural networks are then used to extract visual features from them. These features are fused into a single feature vector, which is fed into a support vector machine for classification into human activities. For evaluation, a challenging motion activity recognition dataset is used, and single-view, cross-view and cross-subject experiments are performed. Moreover, the proposed approach is compared to three other state-of-the-art methods and demonstrates superior performance in most experiments.
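To make the fusion and classification stage concrete, the following is a minimal sketch of such a late-fusion pipeline, not the authors' implementation: it assumes six already-trained CNN feature extractors exposing a Keras-style predict() method and uses scikit-learn for the SVM; names such as cnn_models, images_per_modality and extract_fused_features are illustrative only.

# Sketch of a late-fusion pipeline as described in the abstract (assumed, not the authors' code):
# six per-modality CNNs act as feature extractors, their features are concatenated,
# and an SVM performs the final activity classification.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def extract_fused_features(cnn_models, images_per_modality):
    """Concatenate the feature vectors produced by each modality's CNN.

    cnn_models          -- list of six trained CNN feature extractors, each with a
                           predict(batch) method returning an (N, d_i) array
    images_per_modality -- list of six batches of 2D representations, one per
                           modality, aligned sample-by-sample
    """
    per_modality = [model.predict(images)
                    for model, images in zip(cnn_models, images_per_modality)]
    return np.concatenate(per_modality, axis=1)  # shape: (N, d_1 + ... + d_6)

# Hypothetical usage: fuse training features, fit the SVM, then predict on test data.
# X_train = extract_fused_features(cnn_models, train_images)
# clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
# clf.fit(X_train, y_train)                      # y_train: activity labels
# y_pred = clf.predict(extract_fused_features(cnn_models, test_images))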
Pages: 20