DVDS: A deep visual dynamic slam system

Cited by: 4
Authors
Xie, Tao [1]
Sun, Qihao [1]
Sun, Tao [1]
Zhang, Jinhang [1]
Dai, Kun [1]
Zhao, Lijun [1]
Wang, Ke [1]
Li, Ruifeng [1]
Affiliations
[1] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
simultaneous localization and mapping; Transformer; Deep learning; VERSATILE;
DOI
10.1016/j.eswa.2024.125438
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Simultaneous localization and mapping (SLAM) utilizing visual sensors represents an extensively investigated research area, holding significant potential for advancements in robotics and autonomous vehicular systems. Recently, dense SLAM systems underpinned by learning-based methodologies have showcased superior accuracy and robustness compared to conventional techniques. Nevertheless, contemporary learning-based SLAM systems exhibit notable discrepancies in pose estimation, particularly within dynamic environments. In addition, the constrained receptive field of convolutional features in these methods impedes their efficacy when confronted with homogeneous, texture-less images, rendering them vulnerable to noise perturbations. We develop a novel deep visual dynamic SLAM (DVDS) system that exploits solely static pixels within images to retrieve the camera poses. Specifically, we formulate a dynamic object exclusion mechanism that excises dynamic constituents within the scene before the optical flow computation, thus optimizing the precision of the estimation. In addition, we unveil an efficient dispersive transformer (DisFormer) that facilitates per-pixel features in assimilating long-range information from surrounding features, culminating in constructing more precise 4D correlation volumes. Building on the DisFormer, we propose a DisFormer-based gated recurrent unit (GRU) to generate a refined flow field coupled with a confidence map, which is subsequently employed by the dense bundle adjustment layer to iteratively rectify the residuals of inverse depths and associated camera poses. The global receptive field provided by the DisFormer promotes information integration from a wider contextual window, thus improving the robustness of our SLAM system. Comprehensive experiments underscore that our proposed DVDS system manifests superior efficacy compared with state-of-the-art works across both static and dynamic scenes.
Pages: 15
Related papers
50 in total
  • [1] DFD-SLAM: Visual SLAM with Deep Features in Dynamic Environment
    Qian, Wei
    Peng, Jiansheng
    Zhang, Hongyu
    APPLIED SCIENCES-BASEL, 2024, 14 (11):
  • [2] A Survey of Deep Learning Application in Dynamic Visual SLAM
    Lai, Dongcheng
    Zhang, Yunjian
    Li, Congduan
    2020 INTERNATIONAL CONFERENCE ON BIG DATA & ARTIFICIAL INTELLIGENCE & SOFTWARE ENGINEERING (ICBASE 2020), 2020, : 279 - 283
  • [3] A Robust Visual SLAM System in Dynamic Environment
    Ma, Huajun
    Qin, Yijun
    Duan, Shukai
    Wang, Lidan
    ADVANCES IN NEURAL NETWORKS-ISNN 2024, 2024, 14827 : 248 - 257
  • [4] Visual SLAM algorithm in dynamic environment based on deep learning
    Yu, Yingjie
    Chen, Shuai
    Yang, Xinpeng
    Xu, Changzhen
    Zhang, Sen
    Xiao, Wendong
    INDUSTRIAL ROBOT-THE INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH AND APPLICATION, 2025, 52 (01): 28 - 35
  • [5] Dynamic SLAM: A Visual SLAM in Outdoor Dynamic Scenes
    Wen, Shuhuan
    Li, Xiongfei
    Liu, Xin
    Li, Jiaqi
    Tao, Sheng
    Long, Yidan
    Qiu, Tony
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [6] DOA-SLAM: An Efficient Stereo Visual SLAM System in Dynamic Environment
    Zhaoqian Jia
    Yixiao Ma
    Junwen Lai
    Zhiguo Wang
    International Journal of Control, Automation and Systems, 2025, 23 (4) : 1181 - 1198
  • [7] DMS-SLAM: semantic visual SLAM based on deep mask segmentation in dynamic environments
    Gao, Shuyuan
    Zhang, Minhui
    Gao, Xicheng
    Zhang, Dawei
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (04)
  • [8] Deep Patch Visual SLAM
    Lipson, Lahav
    Teed, Zachary
    Deng, Jia
    COMPUTER VISION - ECCV 2024, PT II, 2025, 15060 : 424 - 440
  • [9] Dynamic visual SLAM based on probability screening and weighting for deep features
    Fu, Fuji
    Yang, Jinfu
    Ma, Jiaqi
    Zhang, Jiahui
    MEASUREMENT, 2024, 236
  • [10] Deep learning-based visual slam for indoor dynamic scenes
    Xu, Zhendong
    Song, Yong
    Pang, Bao
    Xu, Qingyang
    Yuan, Xianfeng
    APPLIED INTELLIGENCE, 2025, 55 (06)