Exploring the Temporal Consistency of Arbitrary Style Transfer: A Channelwise Perspective

Cited by: 17
Authors
Kong, Xiaoyu [1 ,2 ]
Deng, Yingying [3 ]
Tang, Fan [4 ]
Dong, Weiming [3 ]
Ma, Chongyang [5 ]
Chen, Yongyong
He, Zhenyu [6 ,7 ]
Xu, Changsheng [3 ]
Affiliations
[1] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518073, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[4] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
[5] Kuaishou Technol, Beijing 100085, Peoples R China
[6] Harbin Inst Technol, Dept Comp Sci, Shenzhen 518073, Peoples R China
[7] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
U.S. National Science Foundation; National Key Research and Development Program of China;
Keywords
Correlation; Task analysis; Optical imaging; Integrated optics; Lighting; Optical fiber networks; Image reconstruction; Arbitrary stylization; channel correlation; cross-domain; feature migration;
DOI
10.1109/TNNLS.2022.3230084
CLC Classification
TP18 [Artificial intelligence theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Arbitrary image stylization by neural networks has become a popular topic, and video stylization is attracting more attention as an extension of image stylization. However, when image stylization methods are applied to videos, the results suffer from severe flickering artifacts. In this article, we conduct a detailed and comprehensive analysis of the cause of such flickering. Systematic comparisons among typical neural style transfer approaches show that the feature migration modules of state-of-the-art (SOTA) learning systems are ill-conditioned and can lead to a channelwise misalignment between the input content representations and the generated frames. Unlike traditional methods that relieve the misalignment via additional optical flow constraints or regularization modules, we focus on maintaining temporal consistency by aligning each output frame with the input frame. To this end, we propose a simple yet efficient multichannel correlation network (MCCNet) to ensure that output frames are directly aligned with inputs in the hidden feature space while maintaining the desired style patterns. An inner channel similarity loss is adopted to eliminate side effects caused by the absence of nonlinear operations such as softmax for strict alignment. Furthermore, to improve the performance of MCCNet under complex lighting conditions, we introduce an illumination loss during training. Qualitative and quantitative evaluations demonstrate that MCCNet performs well in arbitrary video and image style transfer tasks.
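The channelwise-alignment idea in the abstract can be illustrated with a minimal sketch. The code below is not the paper's MCCNet; it is a simplified, per-channel (diagonal) feature transform, assuming content and style features share a channel count, written here only to show why a transform that maps each output channel from the same input channel preserves channelwise alignment, whereas modules that mix channels can break it and cause flickering across frames.

```python
import numpy as np

def channelwise_transfer(content, style, eps=1e-5):
    """Diagonal (per-channel) style transfer sketch.

    content, style: (C, H, W) feature maps with the same channel count.
    Each content channel is rescaled to the matching style channel's
    mean and standard deviation. Because every output channel is a
    positive affine function of the *same* input channel, the output
    stays perfectly correlated with the input channel by channel.
    """
    C = content.shape[0]
    c = content.reshape(C, -1)
    s = style.reshape(C, -1)
    c_mean, c_std = c.mean(1, keepdims=True), c.std(1, keepdims=True)
    s_mean, s_std = s.mean(1, keepdims=True), s.std(1, keepdims=True)
    # normalize each content channel, then re-scale with style statistics
    out = (c - c_mean) / (c_std + eps) * s_std + s_mean
    return out.reshape(content.shape)
```

A full cross-channel module would replace the per-channel scaling with a dense C x C mixing matrix; if that matrix is ill-conditioned, small input perturbations between frames produce large, inconsistent channel responses, which is the misalignment the abstract describes.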
Pages: 8482-8496 (15 pages)