Motion2language, unsupervised learning of synchronized semantic motion segmentation

Cited by: 1
Authors
Radouane, Karim [1]
Tchechmedjiev, Andon [1]
Lagarde, Julien [2]
Ranwez, Sylvie [1]
Affiliations
[1] Univ Montpellier, IMT Mines Ales, EuroMov Digital Hlth Mot, Ales, France
[2] Univ Montpellier, IMT Mines Ales, EuroMov Digital Hlth Mot, Montpellier, France
Keywords
Unsupervised learning; Semantic segmentation; Synchronized transcription; GRU; Local recurrent attention; Whole-body motion
DOI
10.1007/s00521-023-09227-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate building a sequence-to-sequence architecture for motion-to-language translation and synchronization. The aim is to translate motion capture inputs into English natural-language descriptions, such that the descriptions are generated synchronously with the actions performed, enabling semantic segmentation as a byproduct, but without requiring synchronized training data. We propose a new recurrent formulation of local attention that is suited to synchronous/live text generation, as well as an improved motion encoder architecture better suited to smaller datasets and to synchronous generation. We evaluate both contributions in individual experiments, using the standard BLEU4 metric as well as a simple semantic equivalence measure, on the KIT motion-language dataset. In a follow-up experiment, we assess the synchronization quality of the text generated by our proposed approaches using multiple evaluation metrics. We find that the contributions to the attention mechanism and to the encoder architecture additively improve both the quality of the generated text (BLEU and semantic equivalence) and the quality of synchronization.
Pages: 4401-4420
Page count: 20
Related papers
50 in total
  • [41] Adversarial unsupervised domain adaptation for 3D semantic segmentation with multi-modal learning
    Liu, Wei
    Luo, Zhiming
    Cai, Yuanzheng
    Yu, Ying
    Ke, Yang
    Marcato Junior, Jose
    Goncalves, Wesley Nunes
    Li, Jonathan
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 176 : 211 - 221
  • [42] Prototype and Context-Enhanced Learning for Unsupervised Domain Adaptation Semantic Segmentation of Remote Sensing Images
    Gao, Kuiliang
    Yu, Anzhu
    You, Xiong
    Qiu, Chunping
    Liu, Bing
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [43] Video Demo: Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video
    Sharma, Alisha
    Ventura, Jonathan
    2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND VIRTUAL REALITY (AIVR), 2019, : 255 - 256
  • [44] Unsupervised Learning from Motion Sensor Data to Assess the Condition of Patients with Parkinson's Disease
    Matic, Teodora
    Aghanavesi, Somayeh
    Memedi, Mevludin
    Nyholm, Dag
    Bergquist, Filip
    Groznik, Vida
    Zabkar, Jure
    Sadikov, Aleksander
    ARTIFICIAL INTELLIGENCE IN MEDICINE, AIME 2019, 2019, 11526 : 420 - 424
  • [45] Unsupervised Learning of Monocular Depth and Ego-Motion with Optical Flow Features and Multiple Constraints
    Zhao, Baigan
    Huang, Yingping
    Ci, Wenyan
    Hu, Xing
    SENSORS, 2022, 22 (04)
  • [46] Unsupervised Learning for Depth, Ego-Motion, and Optical Flow Estimation Using Coupled Consistency Conditions
    Mun, Ji-Hun
    Jeon, Moongu
    Lee, Byung-Geun
    SENSORS, 2019, 19 (11)
  • [47] Spike-Based Motion Estimation for Object Tracking Through Bio-Inspired Unsupervised Learning
    Zheng, Yajing
    Yu, Zhaofei
    Wang, Song
    Huang, Tiejun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 335 - 349
  • [48] Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video with Applications for Virtual Reality
    Sharma, Alisha
    Nett, Ryan
    Ventura, Jonathan
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2020, 14 (03) : 333 - 356
  • [49] Unsupervised Learning of a Hierarchical Spiking Neural Network for Optical Flow Estimation: From Events to Global Motion Perception
    Paredes-Valles, Federico
    Scheper, Kirk Y. W.
    de Croon, Guido C. H. E.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (08) : 2051 - 2064
  • [50] MetaSegNet: Metadata-Collaborative Vision-Language Representation Learning for Semantic Segmentation of Remote Sensing Images
    Wang, Libo
    Dong, Sijun
    Chen, Ying
    Meng, Xiaoliang
    Fang, Shenghui
    Fei, Songlin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62