Hybrid Deep Model for Human Behavior Understanding on Industrial Internet of Video Things

Cited by: 17
Authors
Dai, Cheng [1]
Liu, Xingang [1]
Xu, Hao [1]
Yang, Laurence Tianruo [2]
Deen, M. Jamal [3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 610054, Peoples R China
[2] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS B2G 2W5, Canada
[3] McMaster Univ, Dept Elect Engn & Comp Sci, Hamilton, ON L8S 4K1, Canada
Funding
National Natural Science Foundation of China;
Keywords
Tensors; Feature extraction; Matrix decomposition; Training; Deep learning; Machine learning algorithms; Computational modeling; human behavior understanding; Industrial Internet of Video Things (IIoVT); tensor-train;
DOI
10.1109/TII.2021.3058276
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Human behavior understanding plays an increasingly important role in human-centered Industrial Internet of Video Things (IIoVT) systems, driven by the deep integration of artificial intelligence with the video-based industrial Internet of Things. However, training a deep computation model with a large number of parameters requires expensive computational resources, including high-performance computing units and large memory, which limits its effectiveness and efficiency in IIoVT applications. In this article, a deep model based on the tensor-train mechanism is presented for video human behavior understanding to meet the requirements of IIoVT applications. It achieves competitive accuracy and training efficiency, and shows potential for combining artificial intelligence with front-end IIoVT systems. On the one hand, to achieve desirable accuracy, we improve the conventional convolutional neural network (CNN) and adopt a recurrent neural network mechanism to enhance the video representation over time, which takes the correlation between consecutive deep features into consideration. On the other hand, to enhance the inference capacity between the spatial and temporal features, we apply a self-critical reinforcement learning mechanism in the parameter learning stage. Meanwhile, to further reduce the parameter storage size so that the deep neural network can be deployed on edge devices, the tensor-train mechanism is used: it transforms the parameter matrix into a tensor space and applies tensor decomposition to decrease the number of parameters generated during training. Finally, we conduct extensive experiments to evaluate our scheme, and the results demonstrate that our method improves training efficiency and saves memory for the deep computation model while achieving better accuracy.
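The parameter-reduction idea in the abstract (reshape a weight matrix into a tensor, then decompose it into a tensor train) can be illustrated with a minimal NumPy sketch. This is a generic TT-SVD on a synthetic matrix, not the authors' implementation: the 64 x 64 layer size, the factorization into 4 x 4 x 4 modes, the TT-rank of 4, and all function names are assumptions chosen for illustration, and the abstract does not say whether the model is decomposed after training or trained directly in tensor-train form.

```python
import numpy as np


def matrix_to_tensor(w, row_factors, col_factors):
    """Reshape an (M, N) matrix into a tensor whose k-th mode has size m_k * n_k."""
    d = len(row_factors)
    t = w.reshape(*row_factors, *col_factors)
    # Interleave row and column modes: (m1, n1, m2, n2, ...).
    order = [axis for k in range(d) for axis in (k, d + k)]
    return t.transpose(order).reshape([m * n for m, n in zip(row_factors, col_factors)])


def tensor_to_matrix(t, row_factors, col_factors):
    """Inverse of matrix_to_tensor."""
    d = len(row_factors)
    interleaved = [f for pair in zip(row_factors, col_factors) for f in pair]
    order = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    t = t.reshape(interleaved).transpose(order)
    return t.reshape(int(np.prod(row_factors)), int(np.prod(col_factors)))


def tt_svd(tensor, max_rank):
    """Split a tensor into tensor-train cores with sequential truncated SVDs."""
    dims = tensor.shape
    cores, rank_prev, unfolding = [], 1, tensor
    for k in range(len(dims) - 1):
        unfolding = unfolding.reshape(rank_prev * dims[k], -1)
        u, s, vt = np.linalg.svd(unfolding, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        unfolding = s[:rank, None] * vt[:rank]  # carry the remainder forward
        rank_prev = rank
    cores.append(unfolding.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the cores back into the full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.squeeze(axis=(0, -1))


# Hypothetical layer: a 64 x 64 weight matrix with rows and columns factored as 4 * 4 * 4.
row_factors, col_factors = (4, 4, 4), (4, 4, 4)
rng = np.random.default_rng(0)

# Build a synthetic matrix that has exact TT-rank 4, so the decomposition is lossless.
true_cores = [rng.standard_normal(shape) for shape in [(1, 16, 4), (4, 16, 4), (4, 16, 1)]]
w = tensor_to_matrix(tt_reconstruct(true_cores), row_factors, col_factors)

cores = tt_svd(matrix_to_tensor(w, row_factors, col_factors), max_rank=4)
w_approx = tensor_to_matrix(tt_reconstruct(cores), row_factors, col_factors)

dense_params = w.size
tt_params = sum(core.size for core in cores)
rel_error = np.linalg.norm(w - w_approx) / np.linalg.norm(w)
print(dense_params, tt_params, rel_error)
```

In this sketch the three cores hold 64 + 256 + 64 = 384 values in place of the 4096 entries of the dense matrix. A real trained weight matrix generally does not have exact low TT-rank, so truncation would trade some approximation error for the storage saving.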
Pages: 7000-7008
Number of pages: 9
Related papers
24 in total
  • [1] LONG-RANGE COMMUNICATIONS IN UNLICENSED BANDS: THE RISING STARS IN THE IOT AND SMART CITY SCENARIOS
    Centenaro, Marco
    Vangelista, Lorenzo
    Zanella, Andrea
    Zorzi, Michele
    [J]. IEEE WIRELESS COMMUNICATIONS, 2016, 23 (05) : 60 - 67
  • [2] Chen X, 2015, PROC IEEE C COMPUT V, P4356
  • [3] Human action recognition using two-stream attention based LSTM networks
    Dai, Cheng
    Liu, Xingang
    Lai, Jinfeng
    [J]. APPLIED SOFT COMPUTING, 2020, 86
  • [4] Stacked Convolutional Denoising Auto-Encoders for Feature Representation
    Du, Bo
    Xiong, Wei
    Wu, Jia
    Zhang, Lefei
    Zhang, Liangpei
    Tao, Dacheng
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (04) : 1017 - 1027
  • [5] Video Captioning With Attention-Based LSTM and Semantic Consistency
    Gao, Lianli
    Guo, Zhao
    Zhang, Hanwang
    Xu, Xing
    Shen, Heng Tao
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (09) : 2045 - 2055
  • [6] Trends in IoT based solutions for health care: Moving AI to the edge
    Greco, Luca
    Percannella, Gennaro
    Ritrovato, Pierluigi
    Tortorella, Francesco
    Vento, Mario
    [J]. PATTERN RECOGNITION LETTERS, 2020, 135 : 346 - 353
  • [7] Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
    Hara, Kensho
    Kataoka, Hirokatsu
    Satoh, Yutaka
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6546 - 6555
  • [8] Deep neural networks compression learning based on multiobjective evolutionary algorithms
    Huang, Junhao
    Sun, Weize
    Huang, Lei
    [J]. NEUROCOMPUTING, 2020, 378 (378) : 260 - 269
  • [9] Long Activity Video Understanding Using Functional Object-Oriented Network
    Jelodar, Ahmad Babaeian
    Paulius, David
    Sun, Yu
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (07) : 1813 - 1824
  • [10] Deep network compression based on partial least squares
    Jordao, Artur
    Yamada, Fernando
    Schwartz, William Robson
    [J]. NEUROCOMPUTING, 2020, 406 : 234 - 243