Cross-Modality Self-Attention and Fusion-Based Neural Network for Lower Limb Locomotion Mode Recognition

Cited by: 2

Authors:

Zhao, Changchen [1]

Liu, Kai [2]

Zheng, Hao [3]

Song, Wenbo [4]

Pei, Zhongcai [3]

Chen, Weihai [5]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310018, Peoples R China

[2] Zhejiang Univ Technol, Coll Informat Engn, Hangzhou 310023, Peoples R China

[3] Beihang Univ, Hangzhou Innovat Inst, Hangzhou 310051, Peoples R China

[4] Jilin Normal Univ, Coll Phys Educ, Siping 136000, Peoples R China

[5] Anhui Univ, Sch Elect Engn & Automat, Hefei 230601, Peoples R China

Source:

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING | 2025 / Vol. 22

Keywords:

Cross-modality interaction; self-attention; locomotion mode recognition; lower limb; neural network; INTENT RECOGNITION; PREDICTION; STRATEGY; GAZE;

DOI:

10.1109/TASE.2024.3421276

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Although many wearable sensors make the acquisition of multi-modality data easier, effective feature extraction and fusion of the data is still challenging for lower limb locomotion mode recognition. In this article, a novel neural network is proposed for accurate prediction of five common lower limb locomotion modes: level walking, ramp ascent, ramp descent, stair ascent, and stair descent. First, an encoder-decoder structure is employed to enrich the channel diversity so that useful patterns can be separated from combined patterns. Second, a self-attention based cross-modality interaction module is proposed, which enables bilateral information flow between two encoding paths to fully exploit the interdependencies and to find complementary information between modalities. Third, a multi-modality fusion module is designed in which the complementary features are fused by a channel-wise weighted summation whose coefficients are learned end-to-end. A benchmark dataset containing EMG and IMU signals is collected from 10 healthy subjects and covers the five locomotion modes. Extensive experiments are conducted on one publicly available dataset, ENABL3S, and one self-collected dataset. The results show that the proposed method achieves higher classification accuracy than the compared methods: 98.25% on the ENABL3S dataset and 95.51% on the self-collected dataset.

Note to Practitioners: This article aims to solve a real challenge encountered when intelligent recognition algorithms are applied in wearable robots: how to effectively and efficiently fuse multi-modality data for better decision-making. First, most existing methods directly concatenate the multi-modality data, which increases the data dimensionality and the computational burden. Second, existing recognition neural networks continuously compress the feature size, so the discriminative patterns are submerged in noise and thus difficult to identify. This research decomposes the mixed input signals along the channel dimension so that the useful patterns can be separated. Moreover, this research employs a self-attention mechanism to capture correlations between the two modalities and uses this correlation as a new feature for subsequent representation learning, generating new, compact, and complementary features for classification. We demonstrate that the proposed network achieves 98.25% accuracy and a prediction time of 3.5 ms. We anticipate that the proposed network could be a general scientific and practical methodology of multi-modality signal fusion and feature learning for intelligent systems.
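The abstract names two mechanisms: attention that lets each modality attend to the other, and a channel-wise weighted sum for fusion. The following is a minimal pure-Python sketch of those two ideas only, not the authors' network. All names (`cross_attention`, `fuse`, `alpha`) are hypothetical; the query/key/value projections are assumed to be identity, and the fusion weights are assumed to be a fixed convex pair (`alpha`, `1 - alpha`), whereas the paper learns its coefficients end-to-end and may parameterize them differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_feats, context_feats):
    """Scaled dot-product attention from one modality to another.

    Both inputs are lists of feature vectors (time steps x channels)
    with the same channel count. Each query time step receives a
    weighted average of the context time steps.
    Assumption: identity Q/K/V projections (no learned weights).
    """
    scale = math.sqrt(len(query_feats[0]))
    channels = len(context_feats[0])
    attended = []
    for q in query_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / scale
                  for k in context_feats]
        weights = softmax(scores)
        attended.append([
            sum(w * v[c] for w, v in zip(weights, context_feats))
            for c in range(channels)
        ])
    return attended


def fuse(feats_a, feats_b, alpha):
    """Channel-wise weighted sum: alpha[c] * a[c] + (1 - alpha[c]) * b[c].

    Assumption: alpha is given here; in a trained model it would be learned.
    """
    return [
        [alpha[c] * a[c] + (1.0 - alpha[c]) * b[c] for c in range(len(alpha))]
        for a, b in zip(feats_a, feats_b)
    ]


# Toy features: 2 time steps x 2 channels per modality (made-up values).
emg = [[1.0, 0.0], [0.0, 1.0]]
imu = [[0.5, 0.5], [1.0, 0.0]]

# Bilateral flow: each modality attends to the other.
emg_from_imu = cross_attention(emg, imu)
imu_from_emg = cross_attention(imu, emg)

fused = fuse(emg_from_imu, imu_from_emg, alpha=[0.7, 0.3])
```

Because the fused output keeps the per-modality channel count, it avoids the dimensionality growth that the Note to Practitioners attributes to direct concatenation.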


Pages: 5411-5424

Page count: 14
