LIA: Latent Image Animator

Cited by: 3
Authors
Wang, Yaohui [1 ]
Yang, Di [1 ]
Bremond, Francois [1 ]
Dantcheva, Antitza [1 ]
Affiliations
[1] Univ Cote dAzur, Inria Ctr, 2004 Rte Lucioles, F-06902 Valbonne, France
Keywords
Disentanglement; generative adversarial networks; image animation; interpretability; video generation
DOI
10.1109/TPAMI.2024.3449075
CLC classification
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Previous animation techniques mainly focus on leveraging explicit structure representations (e.g., meshes or keypoints) to transfer motion from driving videos to source images. However, such methods are challenged by large appearance variations between source and driving data, and they require complex additional modules to model appearance and motion separately. To address these issues, we introduce the Latent Image Animator (LIA), a streamlined model for animating high-resolution images. LIA is designed as a simple autoencoder that does not rely on explicit representations. Motion transfer in the pixel space is modeled as linear navigation of motion codes in the latent space. Specifically, such navigation is represented by an orthogonal motion dictionary learned in a self-supervised manner based on the proposed Linear Motion Decomposition (LMD). Extensive experimental results demonstrate that LIA outperforms the state of the art on the VoxCeleb, TaichiHD, and TED-talk datasets with respect to video quality and spatio-temporal consistency. In addition, LIA is well suited for zero-shot high-resolution image animation. Code, models, and a demo video are available at https://github.com/wyhsirius/LIA.
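For illustration, the central idea in the abstract, modeling motion transfer as linear navigation along an orthogonal motion dictionary, can be sketched in a few lines of PyTorch. The snippet below is a minimal, hypothetical rendering of Linear Motion Decomposition, not the authors' released implementation: the dictionary size (20), latent dimension (512), and QR-based orthogonalization are assumptions made for this example.

```python
import torch
import torch.nn as nn


class LinearMotionDecomposition(nn.Module):
    """Hypothetical sketch of LMD: a motion code is a linear combination
    of learned, mutually orthogonal directions (the motion dictionary)."""

    def __init__(self, num_directions: int = 20, latent_dim: int = 512):
        super().__init__()
        # Learnable motion dictionary; orthogonality is enforced at use time.
        self.dictionary = nn.Parameter(torch.randn(num_directions, latent_dim))

    def forward(self, magnitudes: torch.Tensor) -> torch.Tensor:
        # magnitudes: (batch, num_directions), e.g. predicted from a driving frame.
        # Orthonormalize the dictionary rows via a reduced QR decomposition.
        q, _ = torch.linalg.qr(self.dictionary.t())   # (latent_dim, num_directions)
        directions = q.t()                            # orthonormal rows
        # Latent path: weighted sum of the orthogonal directions.
        return magnitudes @ directions                # (batch, latent_dim)


# Usage sketch: animate a source code by linear navigation in latent space.
lmd = LinearMotionDecomposition()
z_source = torch.randn(1, 512)   # latent code of the source image (assumed encoder output)
a = torch.randn(1, 20)           # per-direction magnitudes for one driving frame (assumed)
z_driven = z_source + lmd(a)     # navigated code, to be decoded into the animated frame
```

In the full model, a generator would decode the navigated code into the animated frame; that decoding stage is omitted from this sketch.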
Pages: 10829-10844
Page count: 16
Related papers
50 records in total (items [21]-[30] shown)
[21] Rakhimov, Ruslan; Volkhonskiy, Denis; Artemov, Alexey; Zorin, Denis; Burnaev, Evgeny. Latent Video Transformer. In: VISAPP: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Vol. 5: VISAPP, 2021, pp. 101-112.
[22] Abbasian, Mahyar; Rajabzadeh, Taha; Moradipari, Ahmadreza; Aqajari, Seyed Amir Hossein; Lu, Hongsheng; Rahmani, Amir M. Controlling the Latent Space of GANs through Reinforcement Learning: A Case Study on Task-based Image-to-Image Translation. In: 39th Annual ACM Symposium on Applied Computing, SAC 2024, 2024, pp. 1061-1063.
[23] Li, Guanyue; Liu, Yi; Wei, Xiwen; Zhang, Yang; Wu, Si; Xu, Yong; Wong, Hau-San. Discovering Density-Preserving Latent Space Walks in GANs for Semantic Image Transformations. In: Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, 2021, pp. 1562-1570.
[24] Mou, Kefen; Gao, Sha; Deveci, Muhammet; Kadry, Seifedine. Dual linear latent space constrained generative adversarial networks for hyperspectral image classification. Applied Soft Computing, 2025, 174.
[25] Zhou, Huabing; Ma, Jiayi; Tan, Chiu C.; Zhang, Yanduo; Ling, Haibin. Cross-Weather Image Alignment via Latent Generative Model With Intensity Consistency. IEEE Transactions on Image Processing, 2020, 29: 5216-5228.
[26] Han, Giwoong; Min, Jinhong; Han, Sung Won. EM-LAST: Effective Multidimensional Latent Space Transport for an Unpaired Image-to-Image Translation With an Energy-Based Model. IEEE Access, 2022, 10: 72839-72849.
[27] Ding, Fei; Yang, Yin; Luo, Feng. Clustering by Directly Disentangling Latent Space. In: 2022 IEEE International Conference on Image Processing, ICIP, 2022, pp. 341-345.
[28] Nitzan, Yotam; Bermano, Amit; Li, Yangyan; Cohen-Or, Daniel. Face Identity Disentanglement via Latent Space Mapping. ACM Transactions on Graphics, 2020, 39(6).
[29] You, Senrong; Yuan, Bin; Lyu, Zhihan; Chui, Charles K.; Chen, C. L. Philip; Lei, Baiying; Wang, Shuqiang. Generative AI Enables Synthesizing Cross-Modality Brain Image via Multi-Level-Latent Representation Learning. IEEE Transactions on Computational Imaging, 2024, 10: 1152-1164.
[30] Liu, Zitu; Li, Jiawang; Liu, Yue; Liu, Qun; Wang, Guoyin; Guo, Yike. Interpretability Latent Space Method: Exploiting Shapley Representation to Explain Latent Space. In: 2021 7th International Conference on Big Data and Information Analytics, BIGDIA, 2021, pp. 87-92.