Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop

Cited by: 61
Authors
Biggs, Benjamin [1]
Boyne, Oliver [1]
Charles, James [1]
Fitzgibbon, Andrew [2]
Cipolla, Roberto [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge, England
[2] Microsoft, Cambridge, England
Source
COMPUTER VISION - ECCV 2020, PT XI | 2020 / Vol. 12356
DOI
10.1007/978-3-030-58621-8_12
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce an automatic, end-to-end method for recovering the 3D pose and shape of dogs from monocular internet images. The large variation in shape between dog breeds, significant occlusion, and the low quality of internet images make this a challenging problem. We learn a richer prior over shapes than previous work, which helps regularize parameter estimation. We demonstrate results on the Stanford Dog Dataset, an 'in the wild' dataset of 20,580 dog images for which we have collected 2D joint and silhouette annotations to split for training and evaluation. In order to capture the large shape variety of dogs, we show that the natural variation in the 2D dataset is enough to learn a detailed 3D prior through expectation maximization (EM). As a byproduct of training, we generate a new parameterized model, SMBLD (which includes limb scaling), which we release to the research community alongside our new annotation dataset, StanfordExtra.
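The abstract states that a shape prior is learned through expectation maximization. The sketch below illustrates that general idea on a deliberately tiny problem and is not the paper's method: it assumes a single scalar "shape" value per example, drawn from a Gaussian prior with unknown mean and variance, and observed with additive Gaussian noise of known variance. The function name `fit_gaussian_prior_em` and all data are hypothetical.

```python
import random
import statistics


def fit_gaussian_prior_em(observations, noise_var, n_iters=200):
    """Toy EM: learn the mean and variance of a Gaussian prior over a
    scalar latent value, given noisy observations y = latent + noise.

    Assumption (not from the paper): noise is Gaussian with known
    variance `noise_var`, and each latent value is a single number.
    """
    mu, tau2 = 0.0, 1.0  # initial guess for the prior mean and variance
    n = len(observations)
    for _ in range(n_iters):
        # E-step: Gaussian posterior over each latent value under the
        # current prior (precisions add, means are precision-weighted).
        post_var = 1.0 / (1.0 / tau2 + 1.0 / noise_var)
        post_means = [post_var * (mu / tau2 + y / noise_var)
                      for y in observations]
        # M-step: refit the prior to the posterior means, adding the
        # posterior variance so uncertainty is not thrown away.
        mu = sum(post_means) / n
        tau2 = post_var + sum((m - mu) ** 2 for m in post_means) / n
    return mu, tau2


# Synthetic data: latent values from N(2, 2^2), observed with unit noise.
rng = random.Random(0)
noise_var = 1.0
observations = [rng.gauss(2.0, 2.0) + rng.gauss(0.0, 1.0)
                for _ in range(2000)]

mu_hat, tau2_hat = fit_gaussian_prior_em(observations, noise_var)
print(f"learned prior mean={mu_hat:.3f}, variance={tau2_hat:.3f}")
```

In this linear-Gaussian toy the EM iterations converge to the closed-form maximum-likelihood answer (the sample mean, and the sample variance minus the noise variance), which makes the sketch easy to check. The paper's setting, with a 3D parameterized model fitted to 2D annotations, has no such closed form.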
Pages: 195-211
Page count: 17