Cross-Modality Self-Attention and Fusion-Based Neural Network for Lower Limb Locomotion Mode Recognition

Cited by: 2
Authors
Zhao, Changchen [1 ]
Liu, Kai [2 ]
Zheng, Hao [3 ]
Song, Wenbo [4 ]
Pei, Zhongcai [3 ]
Chen, Weihai [5 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310018, Peoples R China
[2] Zhejiang Univ Technol, Coll Informat Engn, Hangzhou 310023, Peoples R China
[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou 310051, Peoples R China
[4] Jilin Normal Univ, Coll Phys Educ, Siping 136000, Peoples R China
[5] Anhui Univ, Sch Elect Engn & Automat, Hefei 230601, Peoples R China
Keywords
Cross-modality interaction; self-attention; locomotion mode recognition; lower limb; neural network; intent recognition; prediction; strategy; gaze
DOI
10.1109/TASE.2024.3421276
CLC Classification
TP [Automation Technology; Computer Technology]
Subject Classification Code
0812
Abstract
Although many wearable sensors now make the acquisition of multi-modality data easier, effective feature extraction and fusion of these data remain challenging for lower limb locomotion mode recognition. In this article, a novel neural network is proposed for accurate recognition of five common lower limb locomotion modes: level walking, ramp ascent, ramp descent, stair ascent, and stair descent. First, an encoder-decoder structure is employed to enrich channel diversity so that useful patterns can be separated from the combined input. Second, a self-attention-based cross-modality interaction module is proposed, which enables bilateral information flow between the two encoding paths to fully exploit inter-modality dependencies and uncover complementary information. Third, a multi-modality fusion module is designed in which the complementary features are fused by a channel-wise weighted summation whose coefficients are learned end-to-end. A benchmark dataset containing EMG and IMU signals for the five locomotion modes is collected from 10 healthy subjects. Extensive experiments are conducted on the publicly available ENABL3S dataset and on this self-collected dataset. The results show that the proposed method outperforms the compared methods, achieving a classification accuracy of 98.25% on ENABL3S and 95.51% on the self-collected dataset.

Note to Practitioners: This article addresses a real challenge encountered when intelligent recognition algorithms are deployed in wearable robots: how to fuse multi-modality data effectively and efficiently for better decision-making. First, most existing methods directly concatenate the multi-modality data, which increases the data dimensionality and the computational burden. Second, existing recognition networks continuously compress the feature size, so discriminative patterns become submerged in noise and difficult to identify. This research decomposes the mixed input signals along the channel dimension so that useful patterns can be separated. Moreover, it employs a self-attention mechanism to model the correlation between the two modalities and uses this correlation as a new feature for subsequent representation learning, generating compact and complementary features for classification. We demonstrate that the proposed network achieves 98.25% accuracy with a prediction time of 3.5 ms. We anticipate that the proposed network can serve as a general and practical methodology for multi-modality signal fusion and feature learning in intelligent systems.
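To make the described pipeline concrete, below is a minimal PyTorch sketch of the three components the abstract names: channel-expanding encoders, self-attention-based cross-modality interaction, and a channel-wise weighted fusion with end-to-end learned coefficients. All module names, channel counts (7 EMG and 6 IMU channels), the window length, and the 1x1-convolution encoders are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn


class CrossModalityAttention(nn.Module):
    """Attention in which queries come from one modality and keys/values
    from the other, giving bilateral information flow between the paths."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn_a = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.attn_b = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, feat_a, feat_b):
        # feat_a, feat_b: (batch, time, channels) from the EMG / IMU paths
        a2b, _ = self.attn_a(feat_a, feat_b, feat_b)  # EMG attends to IMU
        b2a, _ = self.attn_b(feat_b, feat_a, feat_a)  # IMU attends to EMG
        return feat_a + a2b, feat_b + b2a             # residual interaction


class WeightedFusion(nn.Module):
    """Channel-wise weighted summation; coefficients learned end-to-end."""
    def __init__(self, channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2, channels))

    def forward(self, feat_a, feat_b):
        w = torch.softmax(self.logits, dim=0)  # convex per-channel weights
        return w[0] * feat_a + w[1] * feat_b


class LocomotionNet(nn.Module):
    """Toy end-to-end model for the five locomotion modes."""
    def __init__(self, emg_ch=7, imu_ch=6, channels=64, num_classes=5):
        super().__init__()
        # 1x1 convolutions expand channel diversity, standing in for the
        # paper's encoder-decoder paths.
        self.enc_emg = nn.Conv1d(emg_ch, channels, kernel_size=1)
        self.enc_imu = nn.Conv1d(imu_ch, channels, kernel_size=1)
        self.interact = CrossModalityAttention(channels)
        self.fuse = WeightedFusion(channels)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, emg, imu):
        # emg: (batch, emg_ch, time), imu: (batch, imu_ch, time)
        a = self.enc_emg(emg).transpose(1, 2)  # -> (batch, time, channels)
        b = self.enc_imu(imu).transpose(1, 2)
        a, b = self.interact(a, b)
        fused = self.fuse(a, b)                # (batch, time, channels)
        return self.head(fused.mean(dim=1))    # temporal average pooling


# Usage on a batch of eight 200-sample windows.
model = LocomotionNet()
logits = model(torch.randn(8, 7, 200), torch.randn(8, 6, 200))
print(logits.shape)  # torch.Size([8, 5])
```

The softmax over the two logit rows keeps each channel's fusion weights convex, so the network can smoothly shift per-channel trust between the EMG and IMU features during training.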
Pages: 5411-5424
Page count: 14