Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation

Cited by: 10
Authors
Lin, Yupei [1 ]
Zhang, Sen [2 ]
Chen, Tianshui [1 ]
Lu, Yongyi [1 ]
Li, Guangping [1 ]
Shi, Yukai [1 ]
Affiliations
[1] Guangdong Univ Technol, Guangzhou, Peoples R China
[2] Univ Sydney, Sydney, NSW, Australia
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
contrastive learning; image-to-image translation; generative adversarial network
DOI
10.1145/3503161.3547802
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Classification Codes
081203; 0835
Abstract
Unpaired image-to-image translation aims to find a mapping between a source domain and a target domain. To compensate for the lack of supervised labels for the source images, cycle-consistency based methods preserve image structure by assuming a reversible relationship between unpaired images. However, this assumption exploits only limited correspondence between image pairs. Recently, contrastive learning (CL) has been used to further investigate image correspondence in unpaired translation through patch-based positive/negative learning: patch-based contrastive routines obtain positives by self-similarity computation and treat all remaining patches as negatives. This flexible learning paradigm obtains auxiliary contextualized information at low cost. Since the negatives are so numerous, a natural question arises: are all negatives necessary for feature contrastive learning? Unlike previous CL approaches that use as many negatives as possible, in this paper we study the negatives from an information-theoretic perspective and introduce a new negative Pruning technique for Unpaired image-to-image Translation (PUT) that sparsifies and ranks the patches. The proposed algorithm is efficient and flexible, and it enables the model to learn essential information between corresponding patches stably. By putting quality over quantity, only a few negative patches are required to achieve better results. Finally, we validate the superiority, stability, and versatility of our model through comparative experiments.
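The patch-based selection the abstract describes can be made concrete with a short sketch. The following PyTorch snippet is a minimal, illustrative assumption, not the paper's exact method: the function name `pruned_patch_nce`, the parameter `num_neg`, and the top-k hardest-negative ranking are stand-ins for PUT's information-theoretic sparsification/ranking rule, which the abstract does not spell out. It shows a PatchNCE-style loss in which each translated patch is contrasted against its corresponding source patch as the positive and against only a small, ranked subset of the remaining patches as negatives:

```python
# Minimal sketch of a PatchNCE-style loss with negative pruning (PyTorch).
# The top-k ranking below is an illustrative stand-in for PUT's
# information-theoretic pruning criterion.
import torch
import torch.nn.functional as F

def pruned_patch_nce(feat_src, feat_tgt, num_neg=16, tau=0.07):
    """Contrastive loss over patch features with pruned negatives.

    feat_src: (N, C) patch features from the source image.
    feat_tgt: (N, C) patch features at the same spatial locations in the
              translated image; patch i in feat_tgt is the positive for
              patch i in feat_src.
    num_neg:  number of negatives kept per query after ranking
              (hypothetical knob; the paper's selection rule differs).
    """
    q = F.normalize(feat_tgt, dim=1)   # queries (translated patches)
    k = F.normalize(feat_src, dim=1)   # keys (source patches)

    logits = q @ k.t() / tau           # (N, N) patch similarity matrix
    n = logits.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=logits.device)

    pos = logits[eye].unsqueeze(1)                 # (N, 1) positives (diagonal)
    neg = logits.masked_fill(eye, float("-inf"))   # off-diagonal candidates

    # Prune: keep only the num_neg highest-similarity (hardest) negatives
    # per query instead of all N-1 of them.
    neg_topk, _ = neg.topk(min(num_neg, n - 1), dim=1)

    out = torch.cat([pos, neg_topk], dim=1)        # (N, 1 + num_neg)
    target = torch.zeros(n, dtype=torch.long, device=out.device)
    return F.cross_entropy(out, target)            # positive is class 0

# Example usage with random features: 256 patches, 128-dim each.
if __name__ == "__main__":
    src = torch.randn(256, 128)
    tgt = torch.randn(256, 128)
    print(pruned_patch_nce(src, tgt).item())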
Pages: 1186-1194
Number of pages: 9