FETR: Feature Transformer for vehicle-infrastructure cooperative 3D object detection

Cited by: 1

Authors:

Yan, Wenchao [1]

Cao, Hua [1]

Chen, Jiazhong [1]

Wu, Tao [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 600

Keywords:

Autonomous driving; Vehicle-infrastructure cooperation; Object detection; Feature prediction; Feature enhancement;

DOI:

10.1016/j.neucom.2024.128147

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

3D object detection plays a crucial role in the perception system of autonomous vehicles; however, the vehicle's field of view is restricted by obstructions from nearby vehicles and buildings. Vehicle-infrastructure cooperation can compensate for this limited visibility, but discrepancies in timestamps between vehicle and infrastructure sensors, together with data transmission delays, typically leave the vehicle and infrastructure data temporally asynchronous. We therefore introduce Feature Transformer (FETR), a vehicle-infrastructure cooperative 3D object detection model that uses a Transformer as a feature predictor. The Transformer predictor predicts the features of a future frame from the current frame's features, efficiently addressing the time asynchrony. Additionally, to improve the precision of 3D object detection, we introduce a plug-and-play module named Mask Feature Enhancement (MFE). MFE employs a mask to amplify the features in the object region while diminishing the features of the surrounding environment, enlarging the difference between object features and environmental features and thereby improving detection. Experimental results show that FETR attains 68.15 BEV-mAP (IoU=0.5) on the DAIR-V2X dataset with a 200 ms latency, and that the data transmission is merely 6.0 × 10^4 bytes, just 4.2% of the original point cloud data, outperforming current vehicle-infrastructure cooperative models in both precision and data transmission.
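
The masking idea described in the abstract can be illustrated with a small sketch. This is only one possible reading, not the paper's MFE module: the abstract does not say how the mask is obtained or how strongly features are scaled, so the function name `mask_feature_enhance`, the parameters `gain` and `damp`, and the toy 2 × 2 feature grid are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def mask_feature_enhance(features: Grid, mask: Grid,
                         gain: float = 0.5, damp: float = 0.5) -> Grid:
    """Amplify cells inside the object mask and attenuate cells outside it.

    `gain` and `damp` are hypothetical hyperparameters, not values from the paper.
    `mask` holds 1.0 for object cells and 0.0 for environment cells.
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same number of rows")
    enhanced: Grid = []
    for feat_row, mask_row in zip(features, mask):
        if len(feat_row) != len(mask_row):
            raise ValueError("features and mask rows must have the same length")
        row = []
        for f, m in zip(feat_row, mask_row):
            # m = 1 scales by (1 + gain); m = 0 scales by (1 - damp).
            scale = m * (1.0 + gain) + (1.0 - m) * (1.0 - damp)
            row.append(f * scale)
        enhanced.append(row)
    return enhanced


# Toy bird's-eye-view feature grid with objects on the diagonal (assumed data).
features = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1.0, 0.0], [0.0, 1.0]]
enhanced = mask_feature_enhance(features, mask)
```

With these assumed settings, object cells are multiplied by 1.5 and environment cells by 0.5, which widens the gap between the two groups in the way the abstract describes.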


Pages: 10

