Efficient deep learning-based semantic mapping approach using monocular vision for resource-limited mobile robots

Cited by: 4
Authors
Singh, Aditya [1 ,2 ]
Narula, Raghav [2 ,3 ]
Rashwan, Hatem A. [1 ]
Abdel-Nasser, Mohamed [4 ]
Puig, Domenec [1 ]
Nandi, G. C. [2 ]
Affiliations
[1] Univ Rovira & Virgili, Dept Comp Engn & Math, Tarragona, Spain
[2] Indian Inst Informat Technol, Ctr Intelligent Robots, Allahabad, Uttar Pradesh, India
[3] Thapar Inst Engn & Technol, Patiala, Punjab, India
[4] Aswan Univ, Dept Elect Engn, Aswan, Egypt
Keywords
Visual odometry; Object detection; Household robots; Mapping; Agglomerative clustering; REAL-TIME; SLAM;
DOI
10.1007/s00521-022-07273-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic mapping is still challenging for household collaborative robots. Deep learning models have proved their capability to extract semantics from the scene and learn robot odometry. To interface semantic information with robot odometry, existing approaches extract semantics and robot odometry separately and then integrate them using fusion techniques. Such approaches face many issues during integration, and the mapping procedure requires a lot of memory and resources to process the information. In an attempt to produce accurate semantic mapping with resource-limited devices, this paper proposes an efficient deep learning-based model that simultaneously estimates robot odometry from monocular frame sequences and detects objects in the frames. The proposed model has two main components: a YOLOv3 object detector used as a backbone, and a convolutional long short-term memory (Conv-LSTM) recurrent neural network that models the changes in camera pose. The unique advantage of the proposed model is that it avoids the need for data association and the requirement of multi-sensor fusion. We conducted the experiments on a LoCoBot robot in a laboratory environment, attaining satisfactory results with such limited computational resources. Additionally, we tested the proposed method on the KITTI dataset, reaching an average test loss of 15.93 on various sequences. The experiments are documented in this video: https://www.youtube.com/watch?v=hnmqwxpaTEw.
Pages: 15617-15631
Number of pages: 15