SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images

Cited by: 254
Authors
Coors, Benjamin [1 ,2 ,4 ]
Condurache, Alexandru Paul [3 ,4 ]
Geiger, Andreas [1 ,2 ]
Affiliations
[1] MPI Intelligent Syst, Autonomous Vis Grp, Tubingen, Germany
[2] Univ Tubingen, Tubingen, Germany
[3] Univ Lubeck, Inst Signal Proc, Lubeck, Germany
[4] Robert Bosch GmbH, Stuttgart, Germany
Source
COMPUTER VISION - ECCV 2018, PT IX | 2018 / Vol. 11213
Keywords
DOI
10.1007/978-3-030-01240-3_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Omnidirectional cameras offer great benefits over classical cameras wherever a wide field of view is essential, such as in virtual reality applications or in autonomous robots. Unfortunately, standard convolutional neural networks are not well suited for this scenario as the natural projection surface is a sphere which cannot be unwrapped to a plane without introducing significant distortions, particularly in the polar regions. In this work, we present SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks. Towards this goal, SphereNet adapts the sampling locations of the convolutional filters, effectively reversing distortions, and wraps the filters around the sphere. By building on regular convolutions, SphereNet enables the transfer of existing perspective convolutional neural network models to the omnidirectional case. We demonstrate the effectiveness of our method on the tasks of image classification and object detection, exploiting two newly created semi-synthetic and real-world omnidirectional datasets.
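The idea of adapting the sampling locations of a filter and wrapping it around the sphere can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the kernel is laid out as a regular grid on a plane tangent to the sphere at the filter centre, the grid is mapped back to the sphere with the inverse gnomonic projection, and the longitude is wrapped at the ±π seam. The function name `sphere_kernel_locations` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def sphere_kernel_locations(lat0, lon0, step, ksize=3):
    """Return (lat, lon) sampling locations, in radians, for a ksize x ksize
    kernel centred at (lat0, lon0).

    Assumption (not stated in the abstract): the kernel is a regular grid with
    spacing tan(step) on the tangent plane at the centre, mapped to the sphere
    with the inverse gnomonic projection. Rows are listed top to bottom.
    """
    half = ksize // 2
    t = math.tan(step)
    locations = []
    for j in range(-half, half + 1):        # kernel rows, top to bottom
        for i in range(-half, half + 1):    # kernel columns, left to right
            x = i * t                       # tangent-plane x (towards east)
            y = -j * t                      # tangent-plane y (towards north)
            rho = math.hypot(x, y)
            if rho == 0.0:
                lat, lon = lat0, lon0       # kernel centre maps to itself
            else:
                nu = math.atan(rho)
                s = (math.cos(nu) * math.sin(lat0)
                     + y * math.sin(nu) * math.cos(lat0) / rho)
                lat = math.asin(max(-1.0, min(1.0, s)))  # clamp rounding error
                lon = lon0 + math.atan2(
                    x * math.sin(nu),
                    rho * math.cos(lat0) * math.cos(nu)
                    - y * math.sin(lat0) * math.sin(nu),
                )
            # Wrap longitude into [-pi, pi) so the kernel crosses the seam.
            lon = (lon + math.pi) % (2.0 * math.pi) - math.pi
            locations.append((lat, lon))
    return locations
```

At the equator the kernel is close to a regular grid in latitude and longitude, whereas near a pole the same tangent-plane kernel spans a much wider range of longitudes. That wider spread is what compensates for the stretching of an equirectangular image in the polar regions.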
Pages: 525-541
Page count: 17