A Deep Reinforcement Learning Method For Multimodal Data Fusion in Action Recognition

Citations: 18
Authors
Guo, Jiale [1]
Liu, Qiang [1]
Chen, Enqing [1]
Affiliations
[1] Zhengzhou University, School of Information Engineering, Zhengzhou 450001, People's Republic of China
Keywords
Reinforcement learning; Resource management; Data models; Signal processing algorithms; Task analysis; Neural networks; Decision making; Multimodal action recognition; TD3; fusion weight allocation; deep reinforcement learning
DOI
10.1109/LSP.2021.3128379
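Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809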
Abstract
In current research on multimodal human action recognition, most models perform decision-level fusion by weighted fusion with fixed weights. These weights are usually set from prior experience or found by traversal search, which is either inaccurate or computationally expensive, and fixed weights ignore the fact that each modality represents different action classes with different ability. Drawing on the strong decision-making ability of deep reinforcement learning, we propose a multimodal decision fusion weight allocation network based on deep reinforcement learning. This letter mainly discusses the design of the model: the formulation of the reinforcement learning problem in action recognition, the design of the neural network, and the selection of the solution scheme. Experimental results on the NTU RGB+D and HMDB51 datasets show the effectiveness of the proposed method.
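The fixed-weight decision-level fusion that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `fuse_scores` and `predict`, the two modality names, and all score values are hypothetical, and the reinforcement learning weight allocation itself is not shown.

```python
from typing import List, Sequence


def fuse_scores(scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-modality class scores (decision-level fusion)."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per modality")
    num_classes = len(scores[0])
    return [
        sum(w * s[c] for w, s in zip(weights, scores))
        for c in range(num_classes)
    ]


def predict(scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Index of the class with the highest fused score."""
    fused = fuse_scores(scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical class scores from two modality streams for one sample
# (three classes); the values are made up for illustration.
rgb_scores = [0.5, 0.4, 0.1]
skeleton_scores = [0.1, 0.7, 0.2]

# The same sample under two different fixed weightings.
equal_pred = predict([rgb_scores, skeleton_scores], [0.5, 0.5])
rgb_heavy_pred = predict([rgb_scores, skeleton_scores], [0.9, 0.1])
print(equal_pred, rgb_heavy_pred)
```

With these made-up scores the predicted class changes with the weighting, which is why the choice of weights matters. A fixed weighting applies the same values to every sample, whereas the abstract proposes a network that allocates the fusion weights instead.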
Pages: 120-124
Page count: 5
Related papers
26 in total
  • [1] NEURONLIKE ADAPTIVE ELEMENTS THAT CAN SOLVE DIFFICULT LEARNING CONTROL-PROBLEMS
    BARTO, AG
    SUTTON, RS
    ANDERSON, CW
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1983, 13 (05): 834 - 846
  • [2] Learning Spatiotemporal Features with 3D Convolutional Networks
    Du Tran
    Bourdev, Lubomir
    Fergus, Rob
    Torresani, Lorenzo
    Paluri, Manohar
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4489 - 4497
  • [3] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [4] Gao Z, 2016, 2016 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), P676, DOI 10.1109/ICDSP.2016.7868644
  • [5] Jointly Learning Heterogeneous Features for RGB-D Activity Recognition
    Hu, Jian-Fang
    Zheng, Wei-Shi
    Lai, Jianhuang
    Zhang, Jianguo
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (11): 2186 - 2200
  • [6] Kong Y, 2015, PROC CVPR IEEE, P1054, DOI 10.1109/CVPR.2015.7298708
  • [7] Kuehne H, 2011, IEEE I CONF COMP VIS, P2556, DOI 10.1109/ICCV.2011.6126543
  • [8] Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition
    Li, Maosen
    Chen, Siheng
    Chen, Xu
    Zhang, Ya
    Wang, Yanfeng
    Tian, Qi
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 3590 - 3598
  • [9] Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network
    Li, Wang
    Liu, Xu
    Liu, Zheng
    Du, Feixiang
    Zou, Qiang
    [J]. IEEE ACCESS, 2020, 8 (08): 144529 - 144542
  • [10] Reinforcement learning improves behaviour from evaluative feedback
    Littman, Michael L.
    [J]. NATURE, 2015, 521 (7553): 445 - 451