Rendering Involved and Machine-Learning-based Environment Interpretation

Cited by: 2

Authors:

Kunbum, Park [1]

Tsuchiya, Takeshi [1]

Affiliations:

[1] Univ Tokyo, Dept Aeronaut & Astronaut, 7-3-1 Hongo, Bunkyo Ku, Tokyo 1138656, Japan

Source:

2023 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION, SII | 2023


DOI:

10.1109/SII55687.2023.10039317

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

In this work, I present a methodology for interpreting depth maps obtained from the environment. Unlike previous studies, the geometry is estimated sequentially by combining CNN classifiers and reinforcement modules rather than by an end-to-end approach. The geometry of the environment is assumed to be a combination of basic shapes (panel, box, cylinder, sphere), so inference is performed by manipulating these basic shapes in 3D space. The agent's goal is to make the depth map obtained from its own 3D space approximate the actual depth map as closely as possible. As a result, the inference process is explainable, training does not require manually created labels, and the method can run on a relatively small network.
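The compare-and-adjust loop described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the record does not give the rendering model, the reward, or the network design, so everything here is an assumption made for illustration. The sketch assumes an orthographic camera looking along +z, a scene made of a single sphere, a reward equal to the negative mean absolute depth error, and a plain greedy search standing in for the CNN classifiers and reinforcement modules. The names `render_sphere_depth`, `depth_reward`, and `greedy_fit_x` are hypothetical.

```python
import numpy as np

def render_sphere_depth(center, radius, size=32, extent=2.0, background=10.0):
    """Orthographic depth map of one sphere, viewed along +z from the plane z = 0.

    Assumption for illustration only: the record does not specify a camera model.
    """
    cx, cy, cz = center
    # Pixel centres form a regular grid over [-extent, extent] in x and y.
    coords = np.linspace(-extent, extent, size)
    x, y = np.meshgrid(coords, coords)
    # Squared distance in the image plane from each ray to the sphere centre.
    d2 = (x - cx) ** 2 + (y - cy) ** 2
    hit = d2 <= radius ** 2
    depth = np.full((size, size), background, dtype=float)
    # Front surface of the sphere along each ray: z = cz - sqrt(r^2 - d^2).
    depth[hit] = cz - np.sqrt(radius ** 2 - d2[hit])
    return depth

def depth_reward(candidate, observed):
    """Negative mean absolute depth error; zero means the two maps match."""
    return -float(np.mean(np.abs(candidate - observed)))

def greedy_fit_x(observed, start, radius, step=0.25, max_iters=20):
    """Greedy stand-in for the agent: shift the sphere along x while the reward improves."""
    center = np.array(start, dtype=float)
    best = depth_reward(render_sphere_depth(center, radius), observed)
    for _ in range(max_iters):
        # Evaluate one step to the left and one to the right of the current guess.
        trials = [center + np.array([dx, 0.0, 0.0]) for dx in (-step, step)]
        rewards = [depth_reward(render_sphere_depth(t, radius), observed) for t in trials]
        i = int(np.argmax(rewards))
        if rewards[i] <= best:
            break  # no neighbouring placement explains the observed map better
        center, best = trials[i], rewards[i]
    return center, best

# The "actual" depth map is itself synthetic here, so no manual labels are involved.
observed = render_sphere_depth(center=(0.5, 0.0, 5.0), radius=1.0)
fitted, reward = greedy_fit_x(observed, start=(0.0, 0.0, 5.0), radius=1.0)
print(fitted, reward)
```

Because the target map is produced by the same renderer, the reward is available without hand-made labels, which mirrors the advantage claimed in the abstract. The fitted sphere parameters also remain directly readable, which is the sense in which such a pipeline is explainable.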

Pages: 5