DSC3D: Deformable Sampling Constraints in Stereo 3D Object Detection for Autonomous Driving

Cited by: 0
Authors
Chen, Jiawei [1]
Song, Qi [2]
Guo, Wenzhong [1]
Huang, Rui [2]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou 350116, Peoples R China
[2] Chinese Univ Hong Kong Shenzhen CUHKSZ, Sch Sci & Engn, Hong Kong 518172, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D object detection; stereo matching; binocular images; occluded object; autonomous driving
DOI
10.1109/TCSVT.2024.3499327
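Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract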
Camera-based stereo 3D object detection estimates the 3D properties of objects from binocular images alone, making it a cost-effective solution for autonomous driving. State-of-the-art methods mainly improve detection accuracy for general objects by designing sophisticated stereo matching algorithms or complex pipeline modules. Moreover, additional fine-grained annotations, such as masks or LiDAR point clouds, are often introduced to handle occlusion, which incurs high manual annotation costs. To address the detection bottleneck caused by occlusion in a more cost-effective manner, we develop a novel stereo 3D object detection method named DSC3D, which achieves significant improvements for occluded objects without introducing additional supervision. Specifically, we first report the ambiguity in feature sampling, which refers to the presence of noisy features among those sampled for occluded objects. Then, we propose the Epipolar Constraint Deform-Attention (ECDA) module to address the unreliable left-right correspondence computation in stereo matching caused by occlusion; the module reweights epipolar features by adaptively aggregating local neighbor information. Furthermore, to ensure that 3D property estimation is based on robust object features, we propose a visible-region-guided constraint that explicitly guides offset learning for feature sampling. Extensive experiments on the KITTI benchmark demonstrate that the proposed DSC3D outperforms state-of-the-art camera-based methods.
Pages: 2794-2805
Page count: 12