Learning a Predictable and Generative Vector Representation for Objects

Cited by: 460
Authors
Girdhar, Rohit [1]
Fouhey, David F. [1]
Rodriguez, Mikel [2]
Gupta, Abhinav [1]
Affiliations
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
[2] Mitre Corp, McLean, VA USA
Source
COMPUTER VISION - ECCV 2016, PT VI | 2016 / Vol. 9910
Funding
National Science Foundation (USA);
DOI
10.1007/978-3-319-46466-4_29
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.
Pages: 484-499
Page count: 16