Visual-Inertial Object Detection and Mapping

Cited by: 5
Authors
Fei, Xiaohan [1]
Soatto, Stefano [1]
Affiliations
[1] Univ Calif Los Angeles, UCLA Vis Lab, Los Angeles, CA 90095 USA
Source
COMPUTER VISION - ECCV 2018, PT XI | 2018 / Vol. 11215
Keywords
TRACKING; VISION
DOI
10.1007/978-3-030-01252-6_19
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a method to populate an unknown environment with models of previously seen objects, placed in a Euclidean reference frame that is inferred causally and on-line using monocular video along with inertial sensors. The system we implement returns a sparse point cloud for the regions of the scene that are visible but not recognized as a previously seen object, and a detailed object model and its pose in the Euclidean frame otherwise. The system includes bottom-up and top-down components, whereby deep networks trained for detection provide likelihood scores for object hypotheses provided by a nonlinear filter, whose state serves as memory. Additional networks provide likelihood scores for edges, which complement detection networks trained to be invariant to small deformations. We test our algorithm on existing datasets, and also introduce the VISMA dataset, which provides ground truth pose, point-cloud map, and object models, along with time-stamped inertial measurements.
Pages: 318-334
Page count: 17
Related papers
45 entries in total
[1]  
[Anonymous], 2016, P EUR WORKSH 3D OBJ
[2]  
[Anonymous], 2018, CORR
[3]  
[Anonymous], 2010, INT J COMPUT VISION, DOI 10.1007/s11263-009-0275-4
[4]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[5]  
Blake A., 1997, ADV NEURAL INFORM PR
[6]  
Bowman S.L., 2017, INT C ROB AUT ICRA
[7]   The EuRoC micro aerial vehicle datasets [J].
Burri, Michael ;
Nikolic, Janosch ;
Gohl, Pascal ;
Schneider, Thomas ;
Rehder, Joern ;
Omari, Sammy ;
Achtelik, Markus W. ;
Siegwart, Roland .
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2016, 35 (10) :1157-1163
[9]   Combining monoSLAM with object recognition for scene augmentation using a wearable camera [J].
Castle, R. O. ;
Klein, G. ;
Murray, D. W. .
IMAGE AND VISION COMPUTING, 2010, 28 (11) :1548-1556
[10]  
Choi CH, 2012, IEEE INT C INT ROBOT, P3877, DOI 10.1109/IROS.2012.6386065