LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

Cited by: 8
Authors
Park, Jihye [1]
Kim, Sunwoo [1]
Kim, Soohyun [1]
Cho, Seokju [1]
Yoo, Jaejun [2]
Uh, Youngjung [3]
Kim, Seungryong [1]
Affiliations
[1] Korea Univ, Seoul, South Korea
[2] UNIST, Ulsan, South Korea
[3] Yonsei Univ, Seoul, South Korea
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023
Funding
National Research Foundation, Singapore
Keywords
DOI
10.1109/CVPR52729.2023.02241
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing techniques for image-to-image translation have commonly suffered from two critical problems: heavy reliance on per-sample domain annotation and/or inability to handle multiple attributes per image. Recent truly unsupervised methods adopt clustering approaches to easily provide per-sample one-hot domain labels. However, they cannot account for the real-world setting in which one sample may have multiple attributes. In addition, the semantics of the clusters are not easily coupled to human understanding. To overcome these problems, we present a LANguage-driven Image-to-image Translation model, dubbed LANIT. We leverage easy-to-obtain candidate attributes given in texts for a dataset: the similarity between images and attributes indicates per-sample domain labels. This formulation naturally enables multi-hot labels so that users can specify the target domain with a set of attributes in language. To account for the case that the initial prompts are inaccurate, we also present prompt learning. We further present a domain regularization loss that enforces translated images to be mapped to the corresponding domain. Experiments on several standard benchmarks demonstrate that LANIT achieves comparable or superior performance to existing models. The code is available at github.com/KU-CVLAB/LANIT.
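The labeling idea in the abstract (image-attribute similarity yielding multi-hot domain labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors stand in for embeddings from some joint image-text encoder, the top-k selection rule is an assumption made for illustration, and the names `cosine`, `multi_hot_label`, and `top_k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_hot_label(image_emb, attribute_embs, top_k=2):
    """Mark the top_k attributes most similar to the image with 1, others 0.

    Assumption: selecting a fixed number of attributes per image is one
    simple way to turn similarities into a multi-hot label.
    """
    sims = [cosine(image_emb, a) for a in attribute_embs]
    ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    chosen = set(ranked[:top_k])
    return [1 if i in chosen else 0 for i in range(len(sims))]


# Toy data: hypothetical candidate attributes and made-up embeddings.
attributes = ["blond hair", "smiling", "eyeglasses"]
attribute_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.9, 0.8, 0.1]

label = multi_hot_label(image_emb, attribute_embs, top_k=2)
print(dict(zip(attributes, label)))
```

With these toy vectors the image is closest to "blond hair" and "smiling", so both are set to 1 in the label, which a one-hot cluster assignment could not express.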
Pages: 23401-23411
Number of pages: 11