SwinTExCo: Exemplar-based video colorization using Swin Transformer

Cited by: 0
Authors
Tran, Duong Thanh [1]
Nguyen, Nguyen Doan Hieu [1]
Pham, Trung Thanh [1]
Tran, Phuong-Nam [2]
Vu, Thuy-Duong Thi [1]
Nguyen, Cuong Tuan [3]
Dang-Ngoc, Hanh [4]
Dang, Duc Ngoc Minh [1]
Affiliations
[1] FPT Univ, Long Thanh My Ward, Dept Comp Fundamental, AiTA Lab, D1 St,Saigon Hi Tech Pk, Ho Chi Minh City 71216, Vietnam
[2] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin 446701, South Korea
[3] Vietnamese German Univ, Thoi Hoa Ward, Fac Engn, Ring Rd 4,Quarter 4, Ben Cat 75000, Binh Duong, Vietnam
[4] Ho Chi Minh City Univ Technol HCMUT, Fac Elect & Elect Engn, VNU HCM, 268 Ly Thuong Kiet,Dist 10, Ho Chi Minh City 72506, Vietnam
Keywords
Computer vision; Image colorization; Video colorization; Exemplar-based; Vision transformer; Swin transformer; IMAGE;
DOI
10.1016/j.eswa.2024.125437
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video colorization represents a compelling domain within the field of Computer Vision. The traditional approach in this field relies on Convolutional Neural Networks (CNNs) to extract features from each video frame and employs a recurrent network to learn information between video frames. While demonstrating considerable success in colorization, most traditional CNNs suffer from a limited receptive field size, capturing local information within a fixed-size window. Consequently, they struggle to directly grasp long-range dependencies or pixel relationships that span large areas of an image or video frame. To address this limitation, recent advancements in the field have leveraged the Vision Transformer (ViT) and its variants to enhance performance. This article introduces Swin Transformer Exemplar-based Video Colorization (SwinTExCo), an end-to-end model for the video colorization process that incorporates the Swin Transformer architecture as the backbone. The experimental results demonstrate that our proposed method outperforms many other state-of-the-art methods in both quantitative and qualitative metrics. The achievements of this research have significant implications for the domain of documentary and historical video restoration, contributing to the broader goal of preserving cultural heritage and facilitating a deeper understanding of historical events through enhanced audiovisual materials.
Pages: 14