MultiSense: Cross-labelling and Learning Human Activities Using Multimodal Sensing Data

Cited by: 1
Authors
Zhang, Lan [1 ,2 ]
Zheng, Daren [3 ]
Yuan, Mu [3 ]
Han, Feng [3 ]
Wu, Zhengtao [3 ]
Liu, Mengjing [3 ]
Li, Xiang-Yang [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Anhui, Peoples R China
[3] Univ Sci & Technol China, Hefei 230026, Anhui, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Multimodal sensing data; cross-labelling; cross-learning;
DOI
10.1145/3578267
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
To tap into the gold mine of data generated by Internet of Things (IoT) devices with unprecedented volume and value, there is an urgent need to efficiently and accurately label raw sensor data. To this end, we explore and leverage the hidden connections among the multimodal data collected by various sensing devices and propose to let data of different modalities complement and learn from each other. It is challenging, however, to align and fuse multimodal data without knowing their perception (and thus the correct labels). In this work, we propose MultiSense, a paradigm for automatically mining potential perception, cross-labelling each modality's data, and then updating the human-activity-recognition models to achieve higher accuracy or even recognize new activities. We design innovative solutions for segmenting, aligning, and fusing multimodal data from different sensors, as well as a model-updating mechanism. We implement our framework and conduct comprehensive evaluations on a rich set of data. Our results demonstrate that MultiSense significantly improves data usability and the power of the learning models. With nine diverse activities performed by users, our framework automatically labels multimodal sensing data generated by five different sensing mechanisms (video, smart watch, smartphone, audio, and wireless channel) with an average accuracy of 98.5%. Furthermore, it enables models of some modalities to learn unknown activities from other modalities and greatly improves their activity-recognition ability.
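The cross-labelling idea described in the abstract (transferring labels from a confidently recognized modality to time-aligned segments of another modality) can be illustrated with a minimal sketch. This is not the paper's actual algorithm; the `Segment` type, the overlap threshold, and the greedy best-overlap matching are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    """A time window of one modality's sensor stream, in seconds."""
    start: float
    end: float
    label: Optional[str] = None

def cross_label(source: List[Segment], target: List[Segment],
                min_overlap: float = 0.5) -> List[Segment]:
    """Copy activity labels from an already-labelled modality (source)
    to time-aligned, unlabelled segments of another modality (target).

    A target segment inherits the label of the source segment with the
    largest temporal overlap, but only if that overlap covers at least
    `min_overlap` of the target segment's duration (assumed threshold).
    """
    labelled = []
    for t in target:
        best, best_ov = None, 0.0
        for s in source:
            # Length of the temporal intersection of the two windows.
            ov = max(0.0, min(s.end, t.end) - max(s.start, t.start))
            if ov > best_ov:
                best, best_ov = s, ov
        dur = t.end - t.start
        label = (best.label if best is not None and dur > 0
                 and best_ov / dur >= min_overlap else None)
        labelled.append(Segment(t.start, t.end, label))
    return labelled
```

For example, video segments labelled "walk" (0–5 s) and "run" (5–10 s) would label an accelerometer window spanning 6–9 s as "run", since the overlap covers the whole window.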
Pages: 26