Layered Neural Atlases for Consistent Video Editing

Cited by: 65
Authors
Kasten, Yoni [1 ]
Ofri, Dolev [1 ]
Wang, Oliver [2 ]
Dekel, Tali [1 ]
Affiliations
[1] Weizmann Institute of Science, Rehovot, Israel
[2] Adobe Research
Source
ACM TRANSACTIONS ON GRAPHICS | 2021, Vol. 40, No. 6
Keywords
Video editing; image-based rendering; video propagation; machine learning
DOI
10.1145/3478513.3480546
CLC Number (Chinese Library Classification)
TP31 [Computer Software]
Discipline Code
081202 ; 0835 ;
Abstract
We present a method that decomposes, and "unwraps", an input video into a set of layered 2D atlases, each providing a unified representation of the appearance of an object (or background) over the video. For each pixel in the video, our method estimates its corresponding 2D coordinate in each of the atlases, giving us a consistent parameterization of the video, along with an associated alpha (opacity) value. Importantly, we design our atlases to be interpretable and semantic, which facilitates easy and intuitive editing in the atlas domain, with minimal manual work required. Edits applied to a single 2D atlas (or input video frame) are automatically and consistently mapped back to the original video frames, while preserving occlusions, deformation, and other complex scene effects such as shadows and reflections. Our method employs a coordinate-based Multilayer Perceptron (MLP) representation for mappings, atlases, and alphas, which are jointly optimized on a per-video basis, using a combination of video reconstruction and regularization losses. By operating purely in 2D, our method does not require any prior 3D knowledge about scene geometry or camera poses, and can handle complex dynamic real-world videos. We demonstrate various video editing applications, including texture mapping, video style transfer, image-to-video texture transfer, and segmentation/labeling propagation, all automatically produced by editing a single 2D atlas image.
Pages: 12
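
The abstract describes the method's core components: coordinate-based MLPs that map video coordinates (x, y, t) to per-layer 2D atlas coordinates, atlas MLPs that store each layer's appearance, and an alpha MLP giving per-pixel opacity, all jointly optimized per video against a reconstruction loss. The sketch below illustrates that structure only; it assumes a PyTorch-style setup with illustrative module names and sizes, and it omits the regularization losses the abstract mentions. It is not the authors' released code.

```python
# Minimal sketch of the layered-atlas decomposition described in the abstract.
# Assumption: PyTorch-style implementation; names, sizes, and losses are illustrative.
import torch
import torch.nn as nn


class CoordMLP(nn.Module):
    """Coordinate-based MLP: maps an input coordinate to an output vector."""

    def __init__(self, in_dim, out_dim, hidden=256, depth=6):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


# One mapping network per layer: (x, y, t) -> 2D atlas coordinate (u, v).
mapping_fg = CoordMLP(3, 2)
mapping_bg = CoordMLP(3, 2)
# Atlas networks: (u, v) -> RGB appearance stored in that layer's atlas.
atlas_fg = CoordMLP(2, 3)
atlas_bg = CoordMLP(2, 3)
# Alpha network: (x, y, t) -> foreground opacity used for compositing.
alpha_net = CoordMLP(3, 1)


def reconstruct(xyt):
    """Reconstruct video colors at (x, y, t) samples by compositing the two layers."""
    uv_fg = torch.tanh(mapping_fg(xyt))           # atlas coords in [-1, 1]
    uv_bg = torch.tanh(mapping_bg(xyt))
    rgb_fg = torch.sigmoid(atlas_fg(uv_fg))       # colors in [0, 1]
    rgb_bg = torch.sigmoid(atlas_bg(uv_bg))
    alpha = torch.sigmoid(alpha_net(xyt))         # foreground opacity
    return alpha * rgb_fg + (1.0 - alpha) * rgb_bg


# Joint per-video optimization of all networks against a reconstruction loss.
params = (
    list(mapping_fg.parameters()) + list(mapping_bg.parameters())
    + list(atlas_fg.parameters()) + list(atlas_bg.parameters())
    + list(alpha_net.parameters())
)
optimizer = torch.optim.Adam(params, lr=1e-4)

# Dummy batch of normalized (x, y, t) samples and their observed RGB values.
xyt = torch.rand(1024, 3) * 2.0 - 1.0
rgb_gt = torch.rand(1024, 3)

optimizer.zero_grad()
loss = torch.mean((reconstruct(xyt) - rgb_gt) ** 2)   # video reconstruction term
loss.backward()
optimizer.step()
```

Because every quantity is a continuous function of coordinates, an edit painted onto a rendered atlas image can be resampled at the mapped (u, v) location of any frame, which is what lets a single-atlas edit propagate consistently across the whole video, as the abstract claims.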