Where and What? Examining Interpretable Disentangled Representations

Cited by: 21
Authors
Zhu, Xinqi [1]
Xu, Chang [1]
Tao, Dacheng [1]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
Australian Research Council
DOI
10.1109/CVPR46437.2021.00580
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Capturing interpretable variations has long been one of the goals of disentanglement learning. However, unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting. In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to interpret and what to interpret? A latent code is easy to interpret if it consistently affects a certain subarea of the generated image. We thus propose to learn a spatial mask that localizes the effect of each individual latent dimension. On the other hand, interpretability usually comes from latent dimensions that capture simple and basic variations in the data. We thus impose a perturbation on a certain dimension of the latent code and expect to identify the perturbed dimension from the generated images, so that the encoding of simple variations is enforced. Additionally, we develop an unsupervised model selection method, which accumulates perceptual distance scores along axes in the latent space. On various datasets, our models learn high-quality disentangled representations without supervision, showing that the proposed modeling of interpretability is an effective proxy for achieving unsupervised disentanglement.
Pages: 5857-5866
Page count: 10