Breathing Life Into Sketches Using Text-to-Video Priors

Cited by: 2
Authors
Gal, Rinon [1, 2]
Vinker, Yael [1]
Alaluf, Yuval [1]
Bermano, Amit [1]
Cohen-Or, Daniel [1]
Shamir, Ariel [3]
Chechik, Gal [2]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] NVIDIA, Santa Clara, CA 95051 USA
[3] Reichman Univ, Herzliyya, Israel
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024
DOI
10.1109/CVPR52733.2024.00414
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually. An animated sketch opens another dimension to the expression of ideas and is widely used by designers for a variety of purposes. Animating sketches is a laborious process, requiring extensive experience and professional design skills. In this work, we present a method that automatically adds motion to a single-subject sketch (hence, "breathing life into it"), merely by providing a text prompt indicating the desired motion. The output is a short animation provided in vector representation, which can be easily edited. Our method does not require extensive training, but instead leverages the motion prior of a large pretrained text-to-video diffusion model using a score-distillation loss to guide the placement of strokes. To promote natural and smooth motion and to better preserve the sketch's appearance, we model the learned motion through two components. The first governs small local deformations and the second controls global affine transformations. Surprisingly, we find that even models that struggle to generate sketch videos on their own can still serve as a useful backbone for animating abstract representations.
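The two-component motion model described in the abstract can be illustrated with a minimal sketch. It assumes each frame is produced by applying one global affine map to the stroke control points and then adding a small per-point local offset; the abstract does not state how the two components are combined, so this composition, the affine parameterization, and the names `affine_matrix` and `animate_frame` are hypothetical. The score-distillation loss that would optimize these parameters against a text-to-video model is not shown.

```python
import math


def affine_matrix(scale_x, scale_y, shear, angle, tx, ty):
    """Build a 2x3 affine matrix: rotation @ shear @ scale, plus translation.

    This parameterization is an assumption made for illustration only.
    """
    c, s = math.cos(angle), math.sin(angle)
    # Shear @ scale = [[scale_x, shear * scale_y], [0, scale_y]]
    a, b = scale_x, shear * scale_y
    d, e = 0.0, scale_y
    # Left-multiply by the rotation matrix [[c, -s], [s, c]]
    return [
        [c * a - s * d, c * b - s * e, tx],
        [s * a + c * d, s * b + c * e, ty],
    ]


def animate_frame(points, local_offsets, affine):
    """Apply the global affine map to each control point, then add its local offset."""
    frame = []
    for (x, y), (dx, dy) in zip(points, local_offsets):
        gx = affine[0][0] * x + affine[0][1] * y + affine[0][2]
        gy = affine[1][0] * x + affine[1][1] * y + affine[1][2]
        frame.append((gx + dx, gy + dy))
    return frame


# Usage: translate the sketch by (2, 3) and nudge one control point locally.
control_points = [(1.0, 2.0)]
offsets = [(0.5, -0.5)]
frame = animate_frame(control_points, offsets, affine_matrix(1, 1, 0, 0, 2, 3))
```

Separating the components in this way keeps whole-object motion (translation, rotation, scale, shear) in a handful of global parameters, while the per-point offsets only need to capture small deformations, which is consistent with the abstract's stated aim of preserving the sketch's appearance.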
Pages: 4325-4336
Page count: 12