MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

Cited by: 8
Authors
Zhou, Hongyu [1]
Ge, Zheng [1]
Li, Zeming [1]
Zhang, Xiangyu [1]
Affiliations
[1] MEGVII Technology, Beijing, People's Republic of China
Funding
National Key Research and Development Program of China;
Keywords
DOI
10.1109/ICCV51070.2023.00785
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT. Existing view transformers either suffer from poor efficiency or rely on device-specific operators, hindering the broad application of BEV models. In contrast, our method generates BEV features efficiently with only convolutions and matrix multiplications (MatMul). Specifically, we propose describing the BEV feature as the MatMul of the image feature and a sparse Feature Transporting Matrix (FTM). A Prime Extraction module is then introduced to compress the dimension of the image features and reduce the FTM's sparsity. Moreover, we propose the Ring & Ray Decomposition to replace the FTM with two matrices and reformulate our pipeline to further reduce computation. Compared to existing methods, MatrixVT runs faster and has a smaller memory footprint while remaining deployment-friendly. Extensive experiments on the nuScenes and Waymo benchmarks demonstrate that our method is highly efficient while obtaining results on par with the SOTA method in object detection and map segmentation tasks.
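The central idea in the abstract, writing the BEV feature as a matrix product of the image feature and a sparse transport matrix, can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, and the random pixel-to-cell assignment `cell_index` are all assumptions made for illustration, and the real FTM is presumably derived from camera geometry and depth rather than drawn at random. The sketch only shows that a scatter-style view transformation gives the same result as a single MatMul with a 0/1 matrix.

```python
import numpy as np


def build_toy_ftm(cell_index, num_bev_cells):
    """Build a dense 0/1 transport matrix of shape (num_bev_cells, num_pixels).

    cell_index[i] is the (hypothetical) BEV cell that image position i maps to.
    Each column holds exactly one non-zero entry, so the matrix is very sparse.
    """
    num_pixels = cell_index.shape[0]
    ftm = np.zeros((num_bev_cells, num_pixels), dtype=np.float32)
    ftm[cell_index, np.arange(num_pixels)] = 1.0
    return ftm


def bev_by_matmul(ftm, image_feat):
    """BEV feature as a plain matrix product: (cells, pixels) @ (pixels, channels)."""
    return ftm @ image_feat


def bev_by_scatter(cell_index, image_feat, num_bev_cells):
    """Reference: accumulate every image feature into its assigned BEV cell."""
    bev = np.zeros((num_bev_cells, image_feat.shape[1]), dtype=image_feat.dtype)
    np.add.at(bev, cell_index, image_feat)  # unbuffered add handles repeated cells
    return bev


# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
num_pixels, channels, num_bev_cells = 12, 4, 6

image_feat = rng.standard_normal((num_pixels, channels)).astype(np.float32)
cell_index = rng.integers(0, num_bev_cells, size=num_pixels)  # assumed mapping

ftm = build_toy_ftm(cell_index, num_bev_cells)
bev_matmul = bev_by_matmul(ftm, image_feat)
bev_scatter = bev_by_scatter(cell_index, image_feat, num_bev_cells)

print(bev_matmul.shape)
print(np.allclose(bev_matmul, bev_scatter, atol=1e-5))
```

In this toy setting the matrix has one non-zero entry per column, which hints at why the abstract describes the FTM as sparse and why compressing the image feature and decomposing the matrix would reduce the cost of the product.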
Pages: 8514-8523
Page count: 10