Hybrid Deep Model for Human Behavior Understanding on Industrial Internet of Video Things

Cited by: 17

Authors:

Dai, Cheng [1]

Liu, Xingang [1]

Xu, Hao [1]

Yang, Laurence Tianruo [2]

Deen, M. Jamal [3]

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 610054, Peoples R China

[2] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS B2G 2W5, Canada

[3] McMaster Univ, Dept Elect Engn & Comp Sci, Hamilton, ON L8S 4K1, Canada

Source:

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | 2022, Vol. 18, Issue 10

Funding:

National Natural Science Foundation of China;

Keywords:

Tensors; Feature extraction; Matrix decomposition; Training; Deep learning; Machine learning algorithms; Computational modeling; human behavior understanding; Industrial Internet of Video Things (IIoVT); tensor-train;

DOI:

10.1109/TII.2021.3058276

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Human behavior understanding plays an increasingly important role in human-centered Industrial Internet of Video Things (IIoVT) systems, driven by the deep integration of artificial intelligence and the video-based industrial Internet of Things. However, training a deep computation model with a large number of parameters requires expensive computational resources, including high-performance computing units and large memory, which limits its effectiveness and efficiency for IIoVT applications. In this article, a deep model based on the tensor-train mechanism is presented for video human behavior understanding to meet the requirements of IIoVT applications. It achieves competitive accuracy and training efficiency, with potential for combining artificial intelligence with prefront IIoVT systems. On the one hand, to achieve desirable accuracy, we improve the conventional CNN and adopt a recurrent neural network mechanism to enhance the video representation over time, which takes the correlation between consecutive deep features into consideration. On the other hand, to enhance the inference capacity between the spatial and temporal features, we apply a self-critical reinforcement learning mechanism in the parameter learning stage. Meanwhile, to further reduce the parameter storage size and meet the requirements for deploying deep neural networks on edge devices, the tensor-train mechanism is used, which transforms the parameter matrix into a tensor space and carries out tensor decomposition to decrease the number of parameters generated during training. Finally, we conduct extensive experiments to evaluate our scheme, and the results demonstrate that our method improves training efficiency and saves memory for the deep computation model while achieving better accuracy.
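
The storage saving that the abstract attributes to the tensor-train mechanism can be illustrated with simple parameter counting. The sketch below is not the authors' implementation: it assumes the commonly used TT-matrix layout, in which a weight matrix of size (m1·…·md) × (n1·…·nd) is replaced by d cores of shape (r[k-1], m[k], n[k], r[k]) with r[0] = r[d] = 1. The function names, mode sizes, and ranks are hypothetical values chosen for illustration and do not come from the paper.

```python
from math import prod


def dense_param_count(in_modes, out_modes):
    """Entries in the full weight matrix of shape prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_param_count(in_modes, out_modes, ranks):
    """Total entries over all TT cores; core k has shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1])."""
    if len(in_modes) != len(out_modes):
        raise ValueError("in_modes and out_modes must have the same length")
    if len(ranks) != len(in_modes) + 1:
        raise ValueError("ranks must have one more entry than the number of modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary ranks must both be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical configuration (an assumption, not taken from the paper):
# a 1024 x 1024 fully connected layer factored as 4*8*8*4 on both sides,
# with all internal TT ranks set to 4.
IN_MODES = (4, 8, 8, 4)
OUT_MODES = (4, 8, 8, 4)
RANKS = (1, 4, 4, 4, 1)

dense = dense_param_count(IN_MODES, OUT_MODES)
tt = tt_param_count(IN_MODES, OUT_MODES, RANKS)
print(f"dense: {dense}, tensor-train: {tt}, ratio: {dense / tt:.1f}x")
```

Under these assumed sizes the layer stores 2,176 values instead of 1,048,576. The actual saving in any model depends on the chosen mode factorization and ranks, and larger ranks trade storage for representational capacity.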


Pages: 7000-7008

Page count: 9

