FLOW-GUIDED TRANSFORMER FOR VIDEO COLORIZATION

Cited: 0
Authors
Zhai, Yan [1 ]
Tao, Zhulin [1 ]
Dai, Longquan [2 ]
Wang, He [2 ]
Huang, Xianglin [1 ]
Yang, Lifang [1 ]
Affiliations
[1] Commun Univ China, Beijing, Peoples R China
[2] Nanjing Univ Sci & Technol, Nanjing, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2023
Funding
National Natural Science Foundation of China;
Keywords
Video Colorization; Flow-Guided Attention; Transformer;
DOI
10.1109/ICIP49359.2023.10223177
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video colorization aims to add color to black-and-white films. However, accurately propagating color information across an entire video clip is challenging. In this paper, we propose the Flow-Guided Transformer for Video Colorization (FGTVC), an encoder-decoder architecture consisting of a Global Motion Aggregation (GMA) module, residual modules, and Flow-Guided Attention Blocks (FGAB), which exploits information from highly similar neighboring patches when colorizing each video patch. Specifically, we employ a Transformer to capture long-distance dependencies between frames and to learn non-local self-similarity within each frame. To overcome the shortcomings of previous optical-flow-based methods, FGAB uses optical flow as guidance to sample elements from spatio-temporally adjacent frames when computing self-attention. Experiments show that the proposed FGTVC outperforms state-of-the-art methods. In addition, comprehensive results demonstrate the superiority of our framework on real-world video colorization tasks.
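The flow-guided attention idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it uses nearest-neighbor warping instead of learned sampling and a single dense attention head, and all function names (`warp_by_flow`, `flow_guided_attention`) are illustrative.

```python
import numpy as np

def warp_by_flow(feat, flow):
    """Warp a (H, W, C) feature map by a per-pixel flow field (H, W, 2)
    using nearest-neighbor sampling (a simplification of bilinear warping)."""
    H, W, _ = feat.shape
    out = np.zeros_like(feat)
    for y in range(H):
        for x in range(W):
            sx = min(max(int(round(x + flow[y, x, 0])), 0), W - 1)
            sy = min(max(int(round(y + flow[y, x, 1])), 0), H - 1)
            out[y, x] = feat[sy, sx]
    return out

def flow_guided_attention(query, neighbor, flow):
    """Single-head attention where keys/values come from the neighbor
    frame after flow alignment, so each query position attends to
    spatio-temporally corresponding content rather than raw positions."""
    aligned = warp_by_flow(neighbor, flow)           # (H, W, C)
    q = query.reshape(-1, query.shape[-1])           # (HW, C) queries
    kv = aligned.reshape(-1, aligned.shape[-1])      # (HW, C) keys/values
    scores = q @ kv.T / np.sqrt(q.shape[-1])         # scaled dot-product
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)          # softmax rows
    return (attn @ kv).reshape(query.shape)
```

With zero flow the neighbor frame is used as-is; with a nonzero flow field, each key/value is fetched from the displaced position, which is the "guidance" role optical flow plays in FGAB.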
Pages: 2485 - 2489
Page count: 5
Cited References
21 references in total
  • [1] Schelling Points on 3D Surface Meshes
    Chen, Xiaobai
    Saparov, Abulhair
    Pang, Bill
    Funkhouser, Thomas
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2012, 31 (04):
  • [2] Learning Diverse Image Colorization
    Deshpande, Aditya
    Lu, Jiajun
    Yeh, Mao-Chuang
    Chong, Min Jin
    Forsyth, David
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 2877 - 2885
  • [3] FlowNet: Learning Optical Flow with Convolutional Networks
    Dosovitskiy, Alexey
    Fischer, Philipp
    Ilg, Eddy
    Haeusser, Philip
    Hazirbas, Caner
    Golkov, Vladimir
    van der Smagt, Patrick
    Cremers, Daniel
    Brox, Thomas
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 2758 - 2766
  • [4] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778
  • [5] DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement
    Iizuka, Satoshi
    Simo-Serra, Edgar
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2019, 38 (06):
  • [6] Learning to Estimate Hidden Motions with Global Motion Aggregation
    Jiang, Shihao
    Campbell, Dylan
    Lu, Yao
    Li, Hongdong
    Hartley, Richard
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9752 - 9761
  • [7] Kim E., 2021, P IEEE CVF INT C COM, P14667
  • [8] Learning Blind Video Temporal Consistency
    Lai, Wei-Sheng
    Huang, Jia-Bin
    Wang, Oliver
    Shechtman, Eli
    Yumer, Ersin
    Yang, Ming-Hsuan
    [J]. COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 : 179 - 195
  • [9] Fully Automatic Video Colorization with Self-Regularization and Diversity
    Lei, Chenyang
    Chen, Qifeng
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3748 - 3756
  • [10] GUIDED SAMPLING BASED FEATURE AGGREGATION FOR VIDEO OBJECT DETECTION
    Liang, Jun
    Chen, Haosheng
    Yan, Yan
    Lu, Yang
    Wang, Hanzi
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1116 - 1120