LIA: Latent Image Animator

Authors
Wang, Yaohui [1]
Yang, Di [1]
Bremond, Francois [1]
Dantcheva, Antitza [1]
Affiliations
[1] Univ Côte d'Azur, Inria Ctr, 2004 Rte Lucioles, F-06902 Valbonne, France
Keywords
Disentanglement; generative adversarial networks; image animation; interpretability; video generation
DOI
10.1109/TPAMI.2024.3449075
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous animation techniques mainly focus on leveraging explicit structure representations (e.g., meshes or keypoints) for transferring motion from driving videos to source images. However, such methods struggle with large appearance variations between source and driving data, and they require complex additional modules to model appearance and motion separately. To address these issues, we introduce the Latent Image Animator (LIA), streamlined to animate high-resolution images. LIA is designed as a simple autoencoder that does not rely on explicit representations. Motion transfer in the pixel space is modeled as linear navigation of motion codes in the latent space. Specifically, such navigation is represented as an orthogonal motion dictionary learned in a self-supervised manner based on the proposed Linear Motion Decomposition (LMD). Extensive experimental results demonstrate that LIA outperforms state-of-the-art methods on the VoxCeleb, TaichiHD, and TED-talk datasets with respect to video quality and spatio-temporal consistency. In addition, LIA is well equipped for zero-shot high-resolution image animation. Code, models, and a demo video are available at https://github.com/wyhsirius/LIA.
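The idea of "linear navigation of motion codes" along an orthogonal dictionary can be illustrated with a small NumPy sketch. This is not the authors' implementation (that is in the repository linked above). It is a toy example under stated assumptions: the function names, the latent size, the number of directions, and the use of QR decomposition to obtain orthonormal directions are all hypothetical choices made for illustration, and the random vectors stand in for quantities that LIA learns.

```python
import numpy as np


def orthonormal_dictionary(raw_directions):
    """Orthonormalize the rows of raw_directions (assumed full row rank).

    QR decomposition is used here as a stand-in for whatever
    orthogonalization the real model applies (an assumption).
    """
    # np.linalg.qr orthonormalizes columns, so transpose in and out.
    q, _ = np.linalg.qr(raw_directions.T)
    return q.T


def navigate(source_code, dictionary, magnitudes):
    """Move a latent code linearly: code + sum_i magnitudes[i] * dictionary[i]."""
    return source_code + magnitudes @ dictionary


# Hypothetical sizes and random stand-ins for learned quantities.
rng = np.random.default_rng(0)
latent_dim, num_directions = 8, 3
dictionary = orthonormal_dictionary(rng.normal(size=(num_directions, latent_dim)))
source_code = rng.normal(size=latent_dim)
magnitudes = np.array([0.5, -1.0, 2.0])

target_code = navigate(source_code, dictionary, magnitudes)

# Because the directions are orthonormal, projecting the displacement
# back onto the dictionary recovers each magnitude independently.
recovered = (target_code - source_code) @ dictionary.T
```

Orthogonality is what makes the magnitudes separable in this sketch: each coefficient can be read off by a single dot product, without solving a linear system.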
Pages: 10829-10844
Page count: 16