Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint

Cited by: 15
Authors
Guo, Jiaxian [1 ]
Li, Jiachen [2 ]
Fu, Huan [1 ]
Gong, Mingming [3 ]
Zhang, Kun [4 ,6 ]
Tao, Dacheng [1 ,5 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[3] Univ Melbourne, Melbourne, Vic, Australia
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[5] JD Explore Acad, Beijing, Peoples R China
[6] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
Australian Research Council; US National Institutes of Health;
Keywords
REGISTRATION; MAXIMIZATION;
DOI
10.1109/CVPR52688.2022.01771
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Unsupervised image-to-image (I2I) translation aims to learn a domain mapping function that preserves the semantics of input images without paired data. However, because the underlying semantics distributions in the source and target domains are often mismatched, current distribution matching-based methods may distort the semantics when matching distributions, resulting in inconsistency between the input and translated images, known as the semantics distortion problem. In this paper, we focus on low-level I2I translation, where the structure of images is highly related to their semantics. To alleviate semantic distortions in such translation tasks without paired supervision, we propose a novel I2I translation constraint, called the Structure Consistency Constraint (SCC), which promotes the consistency of image structures by reducing the randomness of color transformation in the translation process. To facilitate estimation and maximization of SCC, we propose an approximate representation of mutual information called relative Squared-loss Mutual Information (rSMI) that enjoys efficient analytic solutions. Our SCC can be easily incorporated into most existing translation models. Quantitative and qualitative comparisons on a range of low-level I2I translation tasks show that translation models with SCC outperform the original models by a significant margin with little additional computational and memory cost.
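The abstract's rSMI variant is not specified in this record. As background for why a squared-loss formulation "enjoys efficient analytic solutions", a minimal sketch of plain squared-loss mutual information (SMI) estimation via least-squares density-ratio fitting is given below; the function name, kernel choice, and hyperparameters are illustrative assumptions, not the paper's actual method. The key point is that fitting the ratio p(x,y)/(p(x)p(y)) with a linear-in-parameters kernel model reduces to solving one regularized linear system.

```python
import numpy as np

def lsmi(x, y, sigma=1.0, lam=1e-3, n_centers=100, seed=0):
    """Closed-form estimate of squared-loss mutual information (SMI).

    Fits the density ratio r(x, y) = p(x, y) / (p(x) p(y)) with Gaussian
    kernel basis functions; the least-squares fit has an analytic solution,
    so no iterative optimization is needed. x, y: arrays of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    idx = rng.choice(n, size=min(n_centers, n), replace=False)
    cx, cy = x[idx], y[idx]  # kernel centers drawn from the paired samples
    # Gaussian kernels between each sample and each center
    Kx = np.exp(-((x[:, None, :] - cx[None, :, :]) ** 2).sum(-1) / (2 * sigma**2))
    Ky = np.exp(-((y[:, None, :] - cy[None, :, :]) ** 2).sum(-1) / (2 * sigma**2))
    h = (Kx * Ky).mean(axis=0)              # expectation of basis under p(x, y)
    H = (Kx.T @ Kx) * (Ky.T @ Ky) / n**2    # expectation of outer product under p(x) p(y)
    # Analytic solution of the regularized least-squares ratio fit
    theta = np.linalg.solve(H + lam * np.eye(len(h)), h)
    # SMI = 0.5 * E_{p(x,y)}[r] - 0.5, which is 0 when x and y are independent
    return 0.5 * theta @ h - 0.5
```

Because the estimator is a simple linear solve, it can be evaluated (and differentiated) cheaply inside a training loop, which is what makes squared-loss mutual information attractive as a trainable consistency objective compared with variational MI bounds that need an auxiliary network.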
Pages: 18228-18238
Number of pages: 11