MultiSense: Cross-labelling and Learning Human Activities Using Multimodal Sensing Data

Cited by: 1
Authors
Zhang, Lan [1 ,2 ]
Zheng, Daren [3 ]
Yuan, Mu [3 ]
Han, Feng [3 ]
Wu, Zhengtao [3 ]
Liu, Mengjing [3 ]
Li, Xiang-Yang [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Anhui, Peoples R China
[3] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Multimodal sensing data; cross-labelling; cross-learning;
DOI
10.1145/3578267
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
To tap into the gold mine of data generated by Internet of Things (IoT) devices with unprecedented volume and value, there is an urgent need to efficiently and accurately label raw sensor data. To this end, we explore and leverage the hidden connections among the multimodal data collected by various sensing devices and propose to let data of different modalities complement and learn from each other. It is challenging, however, to align and fuse multimodal data without knowing their perception (and thus the correct labels). In this work, we propose MultiSense, a paradigm for automatically mining potential perception, cross-labelling each modality's data, and then updating the human-activity-recognition models to achieve higher accuracy or even recognize new activities. We design innovative solutions for segmenting, aligning, and fusing multimodal data from different sensors, as well as a model-updating mechanism. We implement our framework and conduct comprehensive evaluations on a rich set of data. Our results demonstrate that MultiSense significantly improves data usability and the power of the learning models. With nine diverse activities performed by users, our framework automatically labels multimodal sensing data generated by five different sensing mechanisms (video, smart watch, smartphone, audio, and wireless channel) with an average accuracy of 98.5%. Furthermore, it enables models of some modalities to learn unknown activities from other modalities and greatly improves their activity-recognition ability.
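The cross-labelling idea described in the abstract (transferring labels from a confidently recognized modality to time-aligned segments of another modality) can be illustrated with a minimal sketch. This is not the paper's actual algorithm; the `Segment` type, the overlap threshold, and the greedy best-overlap matching are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    """A time window of one modality's sensor stream, in seconds."""
    start: float
    end: float
    label: Optional[str] = None

def cross_label(source: List[Segment], target: List[Segment],
                min_overlap: float = 0.5) -> List[Segment]:
    """Copy activity labels from an already-labelled modality (source)
    to time-aligned, unlabelled segments of another modality (target).

    A target segment inherits the label of the source segment with the
    largest temporal overlap, but only if that overlap covers at least
    `min_overlap` of the target segment's duration (assumed threshold).
    """
    labelled = []
    for t in target:
        best, best_ov = None, 0.0
        for s in source:
            # Length of the temporal intersection of the two windows.
            ov = max(0.0, min(s.end, t.end) - max(s.start, t.start))
            if ov > best_ov:
                best, best_ov = s, ov
        dur = t.end - t.start
        label = (best.label if best is not None and dur > 0
                 and best_ov / dur >= min_overlap else None)
        labelled.append(Segment(t.start, t.end, label))
    return labelled
```

For example, video segments labelled "walk" (0–5 s) and "run" (5–10 s) would label an accelerometer window spanning 6–9 s as "run", since the overlap covers the whole window.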
Pages: 26