Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint

Cited by: 15
Authors
Guo, Jiaxian [1 ]
Li, Jiachen [2 ]
Fu, Huan [1 ]
Gong, Mingming [3 ]
Zhang, Kun [4 ,6 ]
Tao, Dacheng [1 ,5 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[3] Univ Melbourne, Melbourne, Vic, Australia
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[5] JD Explore Acad, Beijing, Peoples R China
[6] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
Australian Research Council; US National Institutes of Health;
Keywords
REGISTRATION; MAXIMIZATION;
DOI
10.1109/CVPR52688.2022.01771
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Unsupervised image-to-image (I2I) translation aims to learn a domain mapping function that preserves the semantics of input images without paired data. However, because the underlying semantics distributions in the source and target domains are often mismatched, current distribution matching-based methods may distort the semantics when matching distributions, resulting in inconsistency between the input and translated images, known as the semantics distortion problem. In this paper, we focus on low-level I2I translation, where the structure of images is highly related to their semantics. To alleviate semantic distortions in such translation tasks without paired supervision, we propose a novel I2I translation constraint, called the Structure Consistency Constraint (SCC), which promotes the consistency of image structures by reducing the randomness of color transformation in the translation process. To facilitate estimation and maximization of SCC, we propose an approximate representation of mutual information called relative Squared-loss Mutual Information (rSMI) that enjoys efficient analytic solutions. Our SCC can be easily incorporated into most existing translation models. Quantitative and qualitative comparisons on a range of low-level I2I translation tasks show that translation models with SCC outperform the original models by a significant margin with little additional computational and memory cost.
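The abstract's rSMI variant is not specified in this record. As background for why a squared-loss formulation "enjoys efficient analytic solutions", a minimal sketch of plain squared-loss mutual information (SMI) estimation via least-squares density-ratio fitting is given below; the function name, kernel choice, and hyperparameters are illustrative assumptions, not the paper's actual method. The key point is that fitting the ratio p(x,y)/(p(x)p(y)) with a linear-in-parameters kernel model reduces to solving one regularized linear system.

```python
import numpy as np

def lsmi(x, y, sigma=1.0, lam=1e-3, n_centers=100, seed=0):
    """Closed-form estimate of squared-loss mutual information (SMI).

    Fits the density ratio r(x, y) = p(x, y) / (p(x) p(y)) with Gaussian
    kernel basis functions; the least-squares fit has an analytic solution,
    so no iterative optimization is needed. x, y: arrays of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    idx = rng.choice(n, size=min(n_centers, n), replace=False)
    cx, cy = x[idx], y[idx]  # kernel centers drawn from the paired samples
    # Gaussian kernels between each sample and each center
    Kx = np.exp(-((x[:, None, :] - cx[None, :, :]) ** 2).sum(-1) / (2 * sigma**2))
    Ky = np.exp(-((y[:, None, :] - cy[None, :, :]) ** 2).sum(-1) / (2 * sigma**2))
    h = (Kx * Ky).mean(axis=0)              # expectation of basis under p(x, y)
    H = (Kx.T @ Kx) * (Ky.T @ Ky) / n**2    # expectation of outer product under p(x) p(y)
    # Analytic solution of the regularized least-squares ratio fit
    theta = np.linalg.solve(H + lam * np.eye(len(h)), h)
    # SMI = 0.5 * E_{p(x,y)}[r] - 0.5, which is 0 when x and y are independent
    return 0.5 * theta @ h - 0.5
```

Because the estimator is a simple linear solve, it can be evaluated (and differentiated) cheaply inside a training loop, which is what makes squared-loss mutual information attractive as a trainable consistency objective compared with variational MI bounds that need an auxiliary network.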
Pages: 18228-18238
Number of pages: 11