Human centric attention with deep multiscale feature fusion framework for activity recognition in Internet of Medical Things

Cited by: 9
Authors
Hussain, Altaf [1 ]
Khan, Samee Ullah [1 ]
Rida, Imad [2 ]
Khan, Noman [1 ]
Baik, Sung Wook [1 ]
Affiliations
[1] Sejong Univ, Seoul 143747, South Korea
[2] Univ Technol Compiegne, Ctr Rech Royallieu, Lab Biomecan & Bioingn, Compiegne, France
Funding
National Research Foundation, Singapore;
关键词
Human activity recognition; Multiscale feature fusion; Healthcare activity recognition; Internet of medical things; Information fusion; Artificial intelligence; Surveillance system; FALL DETECTION; VIDEO SURVEILLANCE; LSTM; CNN;
DOI
10.1016/j.inffus.2023.102211
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Recent advancements in the Internet of Medical Things (IoMT) have revolutionized the healthcare sector, making it an active research area in both academia and industry. Following these advances, automatic Human Activity Recognition (HAR) is now integrated into the IoMT, facilitating remote patient monitoring systems for smart healthcare. However, implementing HAR via computer vision is intricate due to complex spatiotemporal patterns, single-stream fusion, and cluttered backgrounds. Mainstream approaches rely on pre-trained CNN models, which extract non-salient features because of their generalized weight optimization and limited discriminative feature fusion. In addition, their sequential models perform inadequately in complex scenarios owing to the vanishing gradients encountered during backpropagation across multiple layers. In response to these challenges, we propose a multiscale feature fusion framework for both indoor and outdoor environments to enhance HAR in healthcare monitoring systems, composed of two main stages. First, the proposed Human Centric Attentional Fusion (HCAF) network is fused with the intermediate convolutional features of a lightweight MobileNetV3 backbone to enrich spatial learning capabilities for accurate HAR. Next, a Deep Multiscale Feature Fusion (DMFF) network is proposed that strengthens long-range temporal dependencies by redesigning the traditional bidirectional LSTM network in a residual fashion, followed by Sequential Multihead Attention (SMA) to eliminate non-relevant information and optimize the spatiotemporal feature vectors. The performance of the proposed fusion model is evaluated on benchmark healthcare and general activity datasets. For healthcare, we used the Multiple Cameras Fall and UR Fall Detection datasets, achieving 99.941% and 100% accuracy, respectively. Furthermore, our fusion strategy is rigorously evaluated on three challenging general HAR datasets, HMDB51, UCF101, and UCF50, achieving 74.942%, 97.337%, and 96.156% accuracy and outperforming State-of-The-Art (SOTA) methods. The runtime analysis shows that the proposed method is 2x faster than existing methods.
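To make the two-stage pipeline described in the abstract more concrete, the following is a minimal sketch in PyTorch, assuming a MobileNetV3-Small backbone, a simple spatial-attention block standing in for HCAF, and a residual bidirectional LSTM with multi-head attention standing in for DMFF/SMA. All module names, layer sizes, and hyperparameters here are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the HCAF + DMFF pipeline described in the abstract.
# Module names and hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small


class HCAF(nn.Module):
    """Assumed human-centric attentional fusion: reweights the backbone's
    intermediate convolutional features with a learned spatial attention map."""
    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 8, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 8, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B*T, C, H, W)
        return x * self.attn(x)                 # attention-weighted features


class DMFF(nn.Module):
    """Assumed deep multiscale feature fusion: residual bidirectional LSTM
    followed by sequential multi-head attention over the frame sequence."""
    def __init__(self, feat_dim: int, hidden: int = 256, heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, 2 * hidden)   # match BiLSTM output size
        self.bilstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                              bidirectional=True)
        self.sma = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)

    def forward(self, seq):                     # seq: (B, T, feat_dim)
        h = self.proj(seq)
        out, _ = self.bilstm(h)
        out = out + h                           # residual connection
        fused, _ = self.sma(out, out, out)      # sequential multi-head attention
        return fused.mean(dim=1)                # temporal pooling -> (B, 2*hidden)


class HARModel(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = mobilenet_v3_small(weights=None).features  # 576-ch maps
        self.hcaf = HCAF(576)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dmff = DMFF(576)
        self.head = nn.Linear(512, num_classes)

    def forward(self, clip):                    # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        x = self.backbone(clip.flatten(0, 1))   # per-frame spatial features
        x = self.pool(self.hcaf(x)).flatten(1)  # (B*T, 576)
        x = x.reshape(b, t, -1)
        return self.head(self.dmff(x))          # class logits


if __name__ == "__main__":
    model = HARModel(num_classes=51)            # e.g. HMDB51 has 51 classes
    logits = model(torch.randn(2, 8, 3, 224, 224))
    print(logits.shape)                         # torch.Size([2, 51])
```

The sketch only mirrors the data flow the abstract describes (frame-level spatial attention fused with backbone features, then residual BiLSTM plus multi-head attention for temporal fusion); training details, multiscale branches, and exact layer placement would follow the paper itself.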
Pages: 15