Learning a Predictable and Generative Vector Representation for Objects

Cited by: 460

Authors:

Girdhar, Rohit [1]

Fouhey, David F. [1]

Rodriguez, Mikel [2]

Gupta, Abhinav [1]

Affiliations:

[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA

[2] Mitre Corp, Mclean, VA USA

Source:

COMPUTER VISION - ECCV 2016, PT VI | 2016 / Vol. 9910

Funding:

National Science Foundation (USA);

Keywords:

DOI:

10.1007/978-3-319-46466-4_29

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.
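The two components named in the abstract can be pictured with a toy sketch. The abstract only states that an autoencoder makes the embedding generative and a convolutional network makes it predictable from images; everything else below is an assumption made for illustration: single linear layers stand in for both networks, the sizes VOXELS, PIXELS and EMBED are hypothetical, squared error is used for both objectives, and the inference path (image network followed by the decoder) is one plausible reading of "voxel prediction from 2D images". No training step is shown.

```python
import random

def linear(weights, x):
    """Apply a bias-free dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def init(rows, cols, rng):
    """Small random weight matrix with shape rows x cols."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Hypothetical toy sizes, not taken from the paper.
VOXELS, PIXELS, EMBED = 8, 6, 3

rng = random.Random(0)
encoder = init(EMBED, VOXELS, rng)    # (a) autoencoder: voxels -> embedding
decoder = init(VOXELS, EMBED, rng)    # (a) autoencoder: embedding -> voxels
image_net = init(EMBED, PIXELS, rng)  # (b) stand-in for the convolutional network

def training_losses(voxels, image):
    """Return (generative loss, predictability loss) for one voxel/image pair."""
    z = linear(encoder, voxels)        # embedding of the 3D shape
    recon = linear(decoder, z)         # reconstruction from the embedding
    z_img = linear(image_net, image)   # embedding predicted from the 2D image
    return squared_error(recon, voxels), squared_error(z_img, z)

def predict_voxels(image):
    """Assumed inference path: image -> embedding -> decoded voxels."""
    return linear(decoder, linear(image_net, image))

voxels = [float(rng.random() > 0.5) for _ in range(VOXELS)]
image = [rng.random() for _ in range(PIXELS)]
gen_loss, pred_loss = training_losses(voxels, image)
prediction = predict_voxels(image)
```

In this sketch the first loss ties the embedding to 3D shape reconstruction, and the second pulls the image branch toward the same embedding, so both branches share one vector space.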

Citation:

Pages: 484-499

Page count: 16