PETR: Position Embedding Transformation for Multi-view 3D Object Detection

Cited by: 160
Authors
Liu, Yingfei [1]
Wang, Tiancai [1]
Zhang, Xiangyu [1]
Sun, Jian [1]
Affiliations
[1] MEGVII Technology, Beijing, People's Republic of China
Source
COMPUTER VISION - ECCV 2022, PT XXVII | 2022 / Vol. 13687
Funding
National Key Research and Development Program of China;
Keywords
Position embedding; Transformer; 3D object detection;
DOI
10.1007/978-3-031-19812-0_31
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can then perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
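The abstract's central idea, encoding 3D coordinates into image features, can be illustrated with a minimal pure-Python sketch for a single pixel. Everything below is an assumption made for illustration and is not taken from this record or from the authors' code: the function names, the depth-bin lifting `(u*d, v*d, d, 1)`, the 4x4 `img2world` matrix, the `pc_range` normalization, and the two-layer ReLU MLP with weights `w1` and `w2`.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pixel_to_3d_coords(u, v, depths, img2world, pc_range):
    """Lift one pixel to 3D points, one per depth bin (hypothetical helper).

    Returns a flat list of length 3 * len(depths) holding (x, y, z)
    normalized to [0, 1] within pc_range = [x0, y0, z0, x1, y1, z1].
    """
    lo, hi = pc_range[:3], pc_range[3:]
    coords = []
    for d in depths:
        # Homogeneous frustum point, mapped to 3D space by the 4x4 matrix.
        x, y, z, _ = matvec(img2world, [u * d, v * d, d, 1.0])
        coords += [(p - a) / (b - a) for p, a, b in zip((x, y, z), lo, hi)]
    return coords


def mlp(x, w1, w2):
    """Two-layer perceptron with ReLU; w1 is (hidden, in), w2 is (out, hidden)."""
    hidden = [max(h, 0.0) for h in matvec(w1, x)]
    return matvec(w2, hidden)


def position_aware_feature(feat, u, v, depths, img2world, pc_range, w1, w2):
    """Add the 3D position embedding of pixel (u, v) to its 2D feature vector."""
    pe = mlp(pixel_to_3d_coords(u, v, depths, img2world, pc_range), w1, w2)
    return [f + p for f, p in zip(feat, pe)]


if __name__ == "__main__":
    identity = [[float(i == j) for j in range(4)] for i in range(4)]
    depths = [1.0, 2.0]
    pc_range = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
    w1 = [[0.1] * 6 for _ in range(4)]   # hidden size 4, input size 3 * 2
    w2 = [[0.5] * 4 for _ in range(3)]   # feature size 3
    print(position_aware_feature([1.0, 2.0, 3.0], 2.0, 3.0,
                                 depths, identity, pc_range, w1, w2))
```

In this sketch the embedding depends only on camera geometry, so it could be computed once per camera and reused. In a detector of the kind the abstract describes, object queries would then attend to these position-aware features.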
Pages: 531-548
Page count: 18
Related papers
50 in total
  • [1] CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
    Xiong, Kaixin
    Gong, Shi
    Ye, Xiaoqing
    Tan, Xiao
    Wan, Ji
    Ding, Errui
    Wang, Jingdong
    Bai, Xiang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21570 - 21579
  • [2] OPEN: Object-Wise Position Embedding for Multi-view 3D Object Detection
    Hou, Jinghua
    Wang, Tong
    Ye, Xiaoqing
    Liu, Zhe
    Gong, Shi
    Tan, Xiao
    Ding, Errui
    Wang, Jingdong
    Bai, Xiang
    COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084 : 146 - 162
  • [3] Multi-View Attentive Contextualization for Multi-View 3D Object Detection
    Liu, Xianpeng
    Zheng, Ce
    Qian, Ming
    Xue, Nan
    Chen, Chen
    Zhang, Zhebin
    Li, Chen
    Wu, Tianfu
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 16688 - 16698
  • [4] Multi-View 3D Object Retrieval With Deep Embedding Network
    Guo, Haiyun
    Wang, Jinqiao
    Gao, Yue
    Li, Jianqiang
    Lu, Hanqing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (12) : 5526 - 5537
  • [5] Viewpoint Equivariance for Multi-View 3D Object Detection
    Chen, Dian
    Li, Jie
    Guizilini, Vitor
    Ambrus, Rares
    Gaidon, Adrien
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 9213 - 9222
  • [6] Multi-View 3D Object Detection Network for Autonomous Driving
    Chen, Xiaozhi
    Ma, Huimin
    Wan, Ji
    Li, Bo
    Xia, Tian
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6526 - 6534
  • [7] Multi-View Object Class Detection with a 3D Geometric Model
    Liebelt, Joerg
    Schmid, Cordelia
    2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 1688 - 1695
  • [8] 3D Object Detection based on Multi-View Feature Point Matching
    Yang, Tian
    Sang, Xinzhu
    Chen, Duo
    Guo, Nan
    Wang, Peng
    Yu, Xunbo
    Yan, Binbin
    Wang, Kuiru
    Yu, Chongxiu
    AI IN OPTICS AND PHOTONICS (AOPC 2019), 2019, 11342
  • [9] AeDet: Azimuth-invariant Multi-view 3D Object Detection
    Feng, Chengjian
    Jie, Zequn
    Zhong, Yujie
    Chu, Xiangxiang
    Ma, Lin
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21580 - 21588
  • [10] BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection
    Li, Yinhao
    Ge, Zheng
    Yu, Guanyi
    Yang, Jinrong
    Wang, Zengran
    Shi, Yukang
    Sun, Jianjian
    Li, Zeming
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 1477 - 1485