Multimodal Transformer for Nursing Activity Recognition

Cited by: 22
Authors
Ijaz, Momal [1]
Diaz, Renato [1]
Chen, Chen [1, 2]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[2] Univ Cent Florida, Ctr Res Comp Vis, Orlando, FL 32816 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 | 2022
DOI
10.1109/CVPRW56347.2022.00224
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In an aging population, elderly patient safety is a primary concern at hospitals and nursing homes, which demands increased nursing care. Nurse activity recognition can not only help ensure that all patients receive the same desired level of care, but also free nurses from manually documenting the activities they perform, leading to a fair and safe place of care for the elderly. In this work, we present a multimodal transformer-based network that extracts features from skeletal joint and acceleration data and fuses them to perform nurse activity recognition. Our method achieves state-of-the-art accuracy of 81.8% on the benchmark dataset from the Nurse Care Activity Recognition Challenge (NCRC). Ablation studies show that our fusion model outperforms single-modality transformer variants (using only acceleration or only skeletal joint data). Our solution also outperforms the state-of-the-art ST-GCN, GRU, and classical hand-crafted-feature-based classifiers by a margin of 1.6% on the NCRC dataset. Code is available at https://github.com/Momilijaz96/MMT_for_NCRC.
Pages: 2064-2073
Page count: 10
Related papers
41 in total
[1]  
Alia Sayeda Shamma, 2021, 3 NURSE CARE ACTIVIT
[2]  
[Anonymous], 2015, NAT CONF COMPUT VIS
[3]   Activity recognition from user-annotated acceleration data [J].
Bao, L ;
Intille, SS .
PERVASIVE COMPUTING, PROCEEDINGS, 2004, 3001 :1-17
[4]   A Study on Human Activity Recognition Using Accelerometer Data from Smartphones [J].
Bayat, Akram ;
Pomplun, Marc ;
Tran, Duc A. .
9TH INTERNATIONAL CONFERENCE ON FUTURE NETWORKS AND COMMUNICATIONS (FNC'14) / THE 11TH INTERNATIONAL CONFERENCE ON MOBILE SYSTEMS AND PERVASIVE COMPUTING (MOBISPC'14) / AFFILIATED WORKSHOPS, 2014, 34 :450-457
[5]   Activity Recognition Using ST-GCN with 3D Motion Data [J].
Cao, Xin ;
Kudo, Wataru ;
Ito, Chihiro ;
Shuzo, Masaki ;
Maeda, Eisaku .
UBICOMP/ISWC'19 ADJUNCT: PROCEEDINGS OF THE 2019 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING AND PROCEEDINGS OF THE 2019 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS, 2019, :689-692
[6]   Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J].
Carreira, Joao ;
Zisserman, Andrew .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4724-4733
[7]  
Chen C, 2014, IEEE ENG MED BIO, P4135, DOI 10.1109/EMBC.2014.6944534
[8]   CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification [J].
Chen, Chun-Fu ;
Fan, Quanfu ;
Panda, Rameswar .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, :347-356
[9]  
Chen Xiangning, 2021, ABS210601548 ARXIV
[10]   A Deep Learning Approach to Human Activity Recognition Based on Single Accelerometer [J].
Chen, Yuqing ;
Xue, Yang .
2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, :1488-1492