Efficient deep learning-based semantic mapping approach using monocular vision for resource-limited mobile robots

Cited by: 4

Authors:

Singh, Aditya ^{[1,2]}

Narula, Raghav ^{[2,3]}

Rashwan, Hatem A. ^{[1]}

Abdel-Nasser, Mohamed ^{[4]}

Puig, Domenec ^{[1]}

Nandi, G. C. ^{[2]}

Affiliations:

[1] Univ Rovira & Virgili, Dept Comp Engn & Math, Tarragona, Spain

[2] Indian Inst Informat Technol, Ctr Intelligent Robots, Allahabad, Uttar Pradesh, India

[3] Thapar Inst Engn & Technol, Patiala, Punjab, India

[4] Aswan Univ, Dept Elect Engn, Aswan, Egypt

Source:

NEURAL COMPUTING & APPLICATIONS | 2022 / Vol. 34 / Issue 18

Keywords:

Visual odometry; Object detection; Household robots; Mapping; Agglomerative clustering; REAL-TIME; SLAM;

DOI:

10.1007/s00521-022-07273-7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semantic mapping remains challenging for household collaborative robots. Deep learning models have proved capable of extracting semantics from a scene and of learning robot odometry. To interface semantic information with robot odometry, existing approaches extract the two separately and then integrate them using fusion techniques. Such approaches face many issues during integration, and the mapping procedure requires substantial memory and computing resources. To produce accurate semantic maps on resource-limited devices, this paper proposes an efficient deep learning-based model that simultaneously estimates robot odometry from monocular frame sequences and detects objects in the frames. The proposed model has two main components: a YOLOv3 object detector used as the backbone, and a convolutional long short-term memory (Conv-LSTM) recurrent neural network that models the changes in camera pose. The unique advantage of the proposed model is that it removes the need for data association and for multi-sensor fusion. We conducted experiments on a LoCoBot robot in a laboratory environment, attaining satisfactory results with such limited computational resources. Additionally, we tested the proposed method on the KITTI dataset, reaching an average test loss of 15.93 across various sequences. The experiments are documented in this video: https://www.youtube.com/watch?v=hnmqwxpaTEw.
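As a rough illustration of what combining odometry with detections means for a semantic map, the following minimal Python sketch accumulates per-frame relative motions into a 2D robot pose and places each detected object in the world frame. It is not the paper's model. The function names, the planar (x, y, heading) pose, and the per-object offsets in the robot frame are all assumptions made for this example, and the hard-coded inputs stand in for values that a learned network would have to predict.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), given in the robot frame, to a world pose."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + dx * math.cos(theta) - dy * math.sin(theta),
        y + dx * math.sin(theta) + dy * math.cos(theta),
        theta + dtheta,
    )

def to_world(pose, offset):
    """Transform an object offset (forward, left) from the robot frame to the world frame."""
    x, y, theta = pose
    fx, fy = offset
    return (
        x + fx * math.cos(theta) - fy * math.sin(theta),
        y + fx * math.sin(theta) + fy * math.cos(theta),
    )

def build_semantic_map(frames):
    """frames: iterable of (relative_motion, [(label, offset), ...]), one item per time step."""
    pose = (0.0, 0.0, 0.0)  # start at the origin, facing along +x
    landmarks = []
    for delta, detections in frames:
        pose = compose(pose, delta)  # odometry update
        for label, offset in detections:
            landmarks.append((label, to_world(pose, offset)))  # semantic annotation
    return pose, landmarks

# Hypothetical per-frame values: drive 1 m, turn 90 degrees left, drive 1 m.
frames = [
    ((1.0, 0.0, 0.0), [("chair", (2.0, 0.0))]),
    ((0.0, 0.0, math.pi / 2), []),
    ((1.0, 0.0, 0.0), [("table", (1.0, 0.0))]),
]
final_pose, semantic_map = build_semantic_map(frames)
```

In this toy run the robot ends near (1, 1) facing along +y, with the chair placed near (3, 0) and the table near (1, 2). A monocular system would also have to estimate the object offsets, which the sketch simply takes as given.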


Pages: 15617-15631

Page count: 15

References (46 in total):

[1] HIFA: Promising Heterogeneous Solar Irradiance Forecasting Approach Based on Kernel Mapping
Abdel-Nasser, Mohamed
Mahmoud, Karar
Lehtonen, Matti
[J]. IEEE ACCESS, 2021, 9 : 144906 - 144915
[2] Accurate photovoltaic power forecasting models using deep LSTM-RNN
Abdel-Nasser, Mohamed
Mahmoud, Karar
[J]. NEURAL COMPUTING & APPLICATIONS, 2019, 31 (07) : 2727 - 2740
[3] Bi-LSTM-CRF Sequence Labeling for Keyphrase Extraction from Scholarly Documents
Al-Zaidy, Rabah A.
Caragea, Cornelia
Giles, C. Lee
[J]. WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019, : 2551 - 2557
[4] Alsadik B., 2021, Journal of Applied Science and Technology Trends, V2, P120, DOI [10.38094/jastt204117, 10.38094/sgej1027]
[5] [Anonymous], 2000, P 13 INT C NEUR INF
[6] Monocular Visual Odometry Based on Depth and Optical Flow Using Deep Learning
Ban, Xicheng
Wang, Hongjian
Chen, Tao
Wang, Ying
Xiao, Yao
[J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
[7] Efficient agglomerative hierarchical clustering
Bouguettaya, Athman
Yu, Qi
Liu, Xumin
Zhou, Xiangmin
Song, Andy
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (05) : 2785 - 2797
[8] Chekhlov D, 2006, LECT NOTES COMPUT SC, V4292, P276
[9] Topological and Semantic Map Generation for Mobile Robot Indoor Navigation
Chen, Yujing
Zhang, Jinmin
Lou, Yunjiang
[J]. INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2021, PT I, 2021, 13013 : 337 - 347
[10] Inverse Depth Parametrization for Monocular SLAM
Civera, Javier
Davison, Andrew J.
Montiel, J. M. Martinez
[J]. IEEE TRANSACTIONS ON ROBOTICS, 2008, 24 (05) : 932 - 945
