Cross-modality motion parameterization for fine-grained video prediction

Cited by: 2
Authors
Yan, Yichao [1 ]
Ni, Bingbing [1 ]
Zhang, Wendong [1 ]
Tang, Jun [1 ]
Yang, Xiaokang [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
Funding
US National Science Foundation;
Keywords
Video generation; Cross-modality constraint; Adversarial learning;
DOI
10.1016/j.cviu.2019.03.006
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While predicting video content is challenging given the huge unconstrained search space, this work explores cross-modality constraints to safeguard the video generation process and improve content prediction. Observing the underlying correspondence between sound and object movement, we propose a novel cross-modality video generation network. Via adversarial training, this network directly links sound with the movement parameters of the operated object and automatically outputs the corresponding object motion according to the rhythm of the given audio signal. We experiment on both rigid-object and non-rigid-object motion prediction tasks and show that, guided by the associated audio information, our method significantly reduces motion uncertainty in the generated video content.
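The abstract's core idea — an adversarially trained generator that maps audio features to object motion parameters, with a discriminator judging (audio, motion) pairs — can be sketched in miniature. The NumPy toy below is not the authors' implementation: the linear models, least-squares GAN loss, dimensions, and synthetic audio-to-motion relationship are all illustrative assumptions, meant only to show the data flow of the cross-modality constraint.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: per-frame audio feature and motion-parameter dimensions.
AUDIO_DIM, MOTION_DIM, N = 8, 4, 256

# Synthetic "ground truth": motion parameters linearly tied to the audio signal.
W_true = rng.normal(size=(AUDIO_DIM, MOTION_DIM))
audio = rng.normal(size=(N, AUDIO_DIM))
motion_real = audio @ W_true

# Linear generator G: audio features -> motion parameters.
Wg = rng.normal(scale=0.1, size=(AUDIO_DIM, MOTION_DIM))
# Linear discriminator D: scores an (audio, motion) pair as real (1) or fake (0).
wd = rng.normal(scale=0.1, size=(AUDIO_DIM + MOTION_DIM,))

lr = 0.005
for step in range(300):
    motion_fake = audio @ Wg
    x_real = np.hstack([audio, motion_real])  # paired audio + true motion
    x_fake = np.hstack([audio, motion_fake])  # paired audio + generated motion

    # Least-squares GAN discriminator update: push D(real)->1, D(fake)->0.
    grad_wd = 2.0 / N * (x_real.T @ (x_real @ wd - 1.0) + x_fake.T @ (x_fake @ wd))
    wd -= lr * grad_wd

    # Generator update: push D(audio, G(audio)) -> 1, i.e. fool the discriminator.
    s = x_fake @ wd            # discriminator scores on fake pairs
    wd_m = wd[AUDIO_DIM:]      # the part of D that reads the motion parameters
    grad_Wg = 2.0 / N * np.outer(audio.T @ (s - 1.0), wd_m)
    Wg -= lr * grad_Wg

# After training, the generator proposes motion parameters for new audio.
motion_pred = audio @ Wg
```

Because the discriminator sees audio and motion jointly, the generator is penalized for motion that is plausible in isolation but inconsistent with the audio — the "cross-modality constraint" that narrows the prediction search space.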
Pages: 11-19 (9 pages)