Adaptive Feature Aggregation Centric Enhance Network for Accurate and Fast Monocular 3-D Object Detection

Cited by: 0

Authors:

Lin, Peng-Wei [1]

Hsu, Chih-Ming [2]

Affiliations:

[1] Natl Taipei Univ Technol, Inst Mech & Elect Engn, Taipei 10608, Taiwan

[2] Natl Taipei Univ Technol, Dept Mech Engn, Taipei 10608, Taiwan

Source:

IEEE Transactions on Instrumentation and Measurement | 2024 / Vol. 73

Keywords:

Three-dimensional displays; Heating systems; Feature extraction; Accuracy; Detectors; Head; Location awareness; Convolution; Autonomous vehicles; Aggregates; 3-D object detection; attention mechanism; autonomous driving; deep learning (DL); monocular image; real time

DOI:

10.1109/TIM.2024.3470026

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Subject classification codes:

0808; 0809

Abstract:

Three-dimensional object detection is crucial in autonomous driving. Monocular 3-D object detection has become a popular research area in autonomous driving because it is easy to deploy and cost-effective. In real-world autonomous driving, detectors must be both real-time and accurate, and deep learning (DL) can deliver both properties. A one-stage center-based object detector is well suited to real-world applications. However, in center-based object detectors, object-centric estimation plays an important role because it significantly influences detection results. To address this issue, we propose a real-time monocular 3-D object detection neural network called the adaptive feature aggregation centric enhance network. The proposed model is an anchor-free, center-based method. To improve accuracy while maintaining inference speed, we propose an adaptive feature aggregation network that aggregates multiscale features by weighting. Furthermore, we propose a centric enhanced module for heatmap prediction to improve the accuracy of object localization and classification. Our model achieves 34.48 fps on an NVIDIA RTX 3070 graphics processing unit (GPU). Extensive experiments on the KITTI benchmark demonstrate that our method achieves good average precision (AP) for cars and pedestrians.
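The abstract names two ideas: aggregating multiscale features by weighting, and locating objects from a center heatmap. The sketch below illustrates only those two general ideas in plain Python. It is not the paper's architecture: the softmax normalization of the weights, the function names (`softmax`, `aggregate`, `heatmap_peak`), and the toy 2x2 feature maps are all assumptions made for illustration, and a real detector would learn the per-scale scores and operate on tensors.

```python
import math

def softmax(scores):
    """Turn raw per-scale scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(feature_maps, scores):
    """Weighted sum of same-sized 2-D feature maps, one weight per scale."""
    weights = softmax(scores)
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for w, fmap in zip(weights, feature_maps):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * fmap[r][c]
    return fused

def heatmap_peak(heatmap):
    """Return (row, col) of the highest response, a stand-in for an object center."""
    best = max((v, r, c) for r, row in enumerate(heatmap) for c, v in enumerate(row))
    return best[1], best[2]

# Three hypothetical scales, assumed already resized to a common 2x2 resolution.
maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 4.0]],
]
fused = aggregate(maps, scores=[0.0, 0.0, 0.0])  # equal scores give equal weights
center = heatmap_peak(fused)
```

With equal scores each scale contributes one third, so the strongest fused response stays at the cell where the third map is largest. Raising one scale's score shifts the fused map toward that scale, which is the sense in which the weighting is adaptive in this toy version.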

Pages: 13