Where and What? Examining Interpretable Disentangled Representations

Cited by: 21
Authors
Zhu, Xinqi [1]
Xu, Chang [1]
Tao, Dacheng [1]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Funding
Australian Research Council
Keywords
DOI
10.1109/CVPR46437.2021.00580
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Capturing interpretable variations has long been one of the goals in disentanglement learning. However, unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting. In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to interpret and what to interpret? A latent code is easy to interpret if it consistently affects a certain subarea of the generated image. We thus propose to learn a spatial mask to localize the effect of each individual latent dimension. On the other hand, interpretability usually comes from latent dimensions that capture simple and basic variations in data. We thus impose a perturbation on a certain dimension of the latent code and expect to identify the perturbation along this dimension from the generated images, so that the encoding of simple variations can be enforced. Additionally, we develop an unsupervised model selection method, which accumulates perceptual distance scores along axes in the latent space. On various datasets, our models learn high-quality disentangled representations without supervision, showing that the proposed modeling of interpretability is an effective proxy for achieving unsupervised disentanglement.
Pages: 5857-5866
Page count: 10