FLOW-GUIDED TRANSFORMER FOR VIDEO COLORIZATION

Cited: 0
Authors
Zhai, Yan [1 ]
Tao, Zhulin [1 ]
Dai, Longquan [2 ]
Wang, He [2 ]
Huang, Xianglin [1 ]
Yang, Lifang [1 ]
Affiliations
[1] Commun Univ China, Beijing, Peoples R China
[2] Nanjing Univ Sci & Technol, Nanjing, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023
Funding
National Natural Science Foundation of China;
Keywords
Video Colorization; Flow-Guided Attention; Transformer;
DOI
10.1109/ICIP49359.2023.10223177
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video colorization aims to add color to black-and-white films. However, accurately propagating color information across an entire video clip is challenging. In this paper, we propose the Flow-Guided Transformer for Video Colorization (FGTVC), an encoder-decoder architecture consisting of a Global Motion Aggregation (GMA) module, residual modules, and Flow-Guided Attention Blocks (FGAB), which exploits information from highly similar neighboring patches when colorizing each video patch. Specifically, we employ a Transformer to capture long-distance dependencies between frames and to learn non-local self-similarity within each frame. To overcome the shortcomings of previous optical-flow-based methods, FGAB uses optical flow as guidance to sample elements from spatio-temporally adjacent frames when computing self-attention. Experiments show that the proposed FGTVC outperforms state-of-the-art methods. In addition, comprehensive results demonstrate the superiority of our framework on real-world video colorization tasks.
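The flow-guided attention idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it uses nearest-neighbor warping instead of learned sampling and a single dense attention head, and all function names (`warp_by_flow`, `flow_guided_attention`) are illustrative.

```python
import numpy as np

def warp_by_flow(feat, flow):
    """Warp a (H, W, C) feature map by a per-pixel flow field (H, W, 2)
    using nearest-neighbor sampling (a simplification of bilinear warping)."""
    H, W, _ = feat.shape
    out = np.zeros_like(feat)
    for y in range(H):
        for x in range(W):
            sx = min(max(int(round(x + flow[y, x, 0])), 0), W - 1)
            sy = min(max(int(round(y + flow[y, x, 1])), 0), H - 1)
            out[y, x] = feat[sy, sx]
    return out

def flow_guided_attention(query, neighbor, flow):
    """Single-head attention where keys/values come from the neighbor
    frame after flow alignment, so each query position attends to
    spatio-temporally corresponding content rather than raw positions."""
    aligned = warp_by_flow(neighbor, flow)           # (H, W, C)
    q = query.reshape(-1, query.shape[-1])           # (HW, C) queries
    kv = aligned.reshape(-1, aligned.shape[-1])      # (HW, C) keys/values
    scores = q @ kv.T / np.sqrt(q.shape[-1])         # scaled dot-product
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)          # softmax rows
    return (attn @ kv).reshape(query.shape)
```

With zero flow the neighbor frame is used as-is; with a nonzero flow field, each key/value is fetched from the displaced position, which is the "guidance" role optical flow plays in FGAB.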
Pages: 2485 - 2489
Page count: 5
Cited References
21 references in total
  • [1] Schelling Points on 3D Surface Meshes
    Chen, Xiaobai
    Saparov, Abulhair
    Pang, Bill
    Funkhouser, Thomas
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2012, 31 (04):
  • [2] Learning Diverse Image Colorization
    Deshpande, Aditya
    Lu, Jiajun
    Yeh, Mao-Chuang
    Chong, Min Jin
    Forsyth, David
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2877 - 2885
  • [3] FlowNet: Learning Optical Flow with Convolutional Networks
    Dosovitskiy, Alexey
    Fischer, Philipp
    Ilg, Eddy
    Haeusser, Philip
    Hazirbas, Caner
    Golkov, Vladimir
    van der Smagt, Patrick
    Cremers, Daniel
    Brox, Thomas
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 2758 - 2766
  • [4] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778
  • [5] DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement
    Iizuka, Satoshi
    Simo-Serra, Edgar
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2019, 38 (06):
  • [6] Learning to Estimate Hidden Motions with Global Motion Aggregation
    Jiang, Shihao
    Campbell, Dylan
    Lu, Yao
    Li, Hongdong
    Hartley, Richard
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9752 - 9761
  • [7] Kim E., 2021, P IEEE CVF INT C COM, P14667
  • [8] Learning Blind Video Temporal Consistency
    Lai, Wei-Sheng
    Huang, Jia-Bin
    Wang, Oliver
    Shechtman, Eli
    Yumer, Ersin
    Yang, Ming-Hsuan
    [J]. COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 : 179 - 195
  • [9] Fully Automatic Video Colorization with Self-Regularization and Diversity
    Lei, Chenyang
    Chen, Qifeng
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3748 - 3756
  • [10] GUIDED SAMPLING BASED FEATURE AGGREGATION FOR VIDEO OBJECT DETECTION
    Liang, Jun
    Chen, Haosheng
    Yan, Yan
    Lu, Yang
    Wang, Hanzi
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1116 - 1120