AN ATTENTION-SEQ2SEQ MODEL BASED ON CRNN ENCODING FOR AUTOMATIC LABANOTATION GENERATION FROM MOTION CAPTURE DATA

Citations: 2
Authors
Li, Min [1]
Miao, Zhenjiang [1]
Zhang, Xiao-Ping [2]
Xu, Wanru [1]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[2] Ryerson Univ, Dept Elect Comp & Biomed Engn, Toronto, ON, Canada
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Funding
Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China;
Keywords
Labanotation generation; motion capture data; seq2seq model; CRNN; attention;
DOI
10.1109/ICASSP39728.2021.9414976
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Labanotation is an important notation system widely used for recording dances. Numerous methods have been proposed for automatic Labanotation generation from motion capture data. Recently, a sequence-to-sequence (seq2seq) model was proposed. However, the encoder of that model encodes only the temporal information of the motion data and does not encode spatial information. It is also challenging for the decoder to align the input and output sequences because their lengths are imbalanced. In this paper, we propose an attention-seq2seq model based on a Convolutional Recurrent Neural Network (CRNN). The proposed model employs a CRNN-based encoder to learn the spatial-temporal information of motion data and applies an attention mechanism to align each target Laban symbol with the relevant parts of the input motion data during decoding. Experiments show that the proposed method performs favorably against state-of-the-art algorithms in the automatic Labanotation generation task.
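The alignment step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain dot-product scoring, which the abstract does not specify; the function name `attention_context` and the toy vectors are hypothetical and stand in for the encoder outputs and decoder state of a real model.

```python
import math

def attention_context(decoder_state, encoder_states):
    """Dot-product attention (an assumed scoring function, not taken from the paper).

    Returns one weight per encoder time step and the weighted sum of encoder states.
    """
    # Similarity between the decoder state and each encoded motion frame.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Numerically stable softmax over time steps.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy example: three encoded time steps, one decoder state (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attention_context(decoder_state, encoder_states)
```

In this toy case the second and third time steps score equally and receive more weight than the first, so the context vector leans toward them. A long motion sequence can be mapped onto a much shorter symbol sequence in this way, one weighted summary per output symbol.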
Pages: 4185-4189
Page count: 5
Related papers
17 in total
[1]   Human action recognition using Lie Group features and convolutional neural networks [J].
Cai, Linqin ;
Liu, Chengpeng ;
Yuan, Rongdi ;
Ding, Heen .
NONLINEAR DYNAMICS, 2020, 99 (04) :3253-3263
[2]  
Choensawat W., 2016, SPRINGER TRAC ADV RO
[3]  
Choensawat Worawat, 2015, MULTIMEDIA TOOLS APP
[4]  
Guest A. H., 2014, Labanotation: The System of Analyzing and Recording Movement, V4th
[5]   Method of generating coded description of human body motion from motion-captured data [J].
Hachimura, K ;
Nakamura, M .
ROBOT AND HUMAN COMMUNICATION, PROCEEDINGS, 2001, :122-127
[6]  
Hao S., 2019, IEEE INT C IM PROC I
[7]   Deep Learning on Lie Groups for Skeleton-based Action Recognition [J].
Huang, Zhiwu ;
Wan, Chengde ;
Probst, Thomas ;
Van Gool, Luc .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1243-1252
[8]  
Li M., 2020, IEEE INT C MULT EXP
[9]  
Li M., 2017, IAPR AS C PATT REC A
[10]   Dance Movement Learning for Labanotation Generation Based on Motion-Captured Data [J].
Li, Min ;
Miao, Zhenjiang ;
Ma, Cong .
IEEE ACCESS, 2019, 7 :161561-161572