MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

Cited by: 8
Authors
Zhou, Hongyu [1]
Ge, Zheng [1]
Li, Zeming [1]
Zhang, Xiangyu [1]
Affiliations
[1] MEGVII Technology, Beijing, People's Republic of China
Funding
National Key Research and Development Program of China
Keywords
DOI
10.1109/ICCV51070.2023.00785
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT. Existing view transformers either suffer from poor efficiency or rely on device-specific operators, hindering the broad application of BEV models. In contrast, our method generates BEV features efficiently with only convolutions and matrix multiplications (MatMul). Specifically, we propose describing the BEV feature as the MatMul of image feature and a sparse Feature Transporting Matrix (FTM). A Prime Extraction module is then introduced to compress the dimension of image features and reduce FTM's sparsity. Moreover, we propose the Ring & Ray Decomposition to replace the FTM with two matrices and reformulate our pipeline to reduce calculation further. Compared to existing methods, MatrixVT enjoys a faster speed and less memory footprint while remaining deploy-friendly. Extensive experiments on nuScenes and Waymo benchmarks demonstrate that our method is highly efficient but obtains results on par with the SOTA method in object detection and map segmentation tasks.
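The abstract's central idea, expressing the BEV feature as a matrix product of the image feature and a sparse Feature Transporting Matrix, can be illustrated with a toy example. This is only a sketch under stated assumptions: the sizes, the values, and the names `matmul`, `image_features`, `transport`, and `bev_features` are hypothetical, and the hand-written 0/1 matrix stands in for an FTM rather than reproducing how the paper constructs one. Prime Extraction and the Ring & Ray Decomposition are not shown.

```python
# Illustrative sketch only: toy sizes and a hand-written transport matrix,
# not the construction used in the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

# Hypothetical flattened image features: 4 image positions, 2 channels each.
image_features = [
    [1.0, 0.5],
    [2.0, 1.5],
    [3.0, 2.5],
    [4.0, 3.5],
]

# Hypothetical sparse transport matrix: 3 BEV cells x 4 image positions.
# A 1 at (i, j) means image position j contributes to BEV cell i.
transport = [
    [1, 1, 0, 0],  # BEV cell 0 gathers image positions 0 and 1
    [0, 0, 1, 0],  # BEV cell 1 gathers image position 2
    [0, 0, 0, 0],  # BEV cell 2 receives no image feature
]

# The view transformation is a single matrix multiplication:
# (BEV cells x image positions) @ (image positions x channels).
bev_features = matmul(transport, image_features)

# Fraction of zero entries, to show how sparse even this toy matrix is.
zeros = sum(row.count(0) for row in transport)
sparsity = zeros / (len(transport) * len(transport[0]))
```

Most entries of the toy matrix are zero, which is the kind of sparsity the abstract says the later modules are designed to reduce. Because the whole transformation is one matrix product, it needs no custom operator, which matches the deployment argument made in the abstract.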
Pages: 8514-8523
Page count: 10