SMooDi: Stylized Motion Diffusion Model

Cited by: 0
Authors
Zhong, Lei [1]
Xie, Yiming [1]
Jampani, Varun [2]
Sun, Deqing [3]
Jiang, Huaizu [1]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Stabil AI, London, England
[3] Google Res, Mountain View, CA USA
Source
COMPUTER VISION-ECCV 2024, PT I | 2025 / Vol. 15059
Keywords
Motion synthesis; Diffusion model; Stylized motion
DOI
10.1007/978-3-031-73232-4_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences. Unlike existing methods that either generate motion of various content or transfer style from one sequence to another, SMooDi can rapidly generate motion across a broad range of content and diverse styles. To this end, we tailor a pre-trained text-to-motion model for stylization. Specifically, we propose style guidance to ensure that the generated motion closely matches the reference style, alongside a lightweight style adaptor that directs the motion towards the desired style while ensuring realism. Experiments across various applications demonstrate that our proposed framework outperforms existing methods in stylized motion generation. Project Page: https://neu-vi.github.io/SMooDi/
Pages: 405-421
Page count: 17