Image-based instance segmentation dense point cloud multimodal 3D object detection
Cited by: 0
Authors:
Yuxiang Xu [1]
Rongyun Zhang [1]
Peicheng Shi [2]
Bingzhou Zhou [3]
Hongwei Ou [1]
Rongxiang Wang [1]
Affiliations:
[1] Anhui Polytechnic University, School of Mechanical and Automotive Engineering
[2] Anhui Polytechnic University, Automotive New Technology Anhui Engineering and Technology Research Center
[3] University of Nottingham, Faculty of Engineering
Keywords:
3D object detection;
Autonomous driving;
Instance segmentation;
Multimodal fusion
DOI:
10.1007/s10489-025-06661-5
Abstract:
3D object detection for autonomous driving must cope with inaccurate detection of distant and small objects, which results from sparse point clouds in complex environments. In complex traffic scenes in particular, single-modal detection methods struggle to meet high accuracy requirements. To address this challenge, this paper proposes a novel multimodal 3D object detection algorithm that densifies point clouds using image instance segmentation. The image is first segmented into instances, and a virtual point cloud is then generated from the instance results and the point cloud projection. The class scores of the instances are encoded as additional point cloud dimensions to enrich the semantic information. The paper also introduces dynamic voxel geometry encoding, which adjusts the size and position of the voxels according to the motion changes of the target object; this adjustment improves the detection of distant and small objects. In addition, a new data augmentation technique is presented to improve the training efficiency and detection performance of the model. Extensive experiments show that the model achieves a 6.2% higher mean average precision (mAP) on the KITTI dataset than the classic multimodal detection method PointPainting, and that it performs particularly well on pedestrians and cyclists. These results demonstrate the method's effectiveness and practicality.
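The two fusion steps named in the abstract, appending instance class scores to projected points and generating virtual points inside an instance mask, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. All function names (`project`, `pixel_lookup`, `paint_points`, `virtual_points`) are hypothetical. The sketch assumes the points are already in the camera frame, a simple 3 x 3 pinhole intrinsic matrix `K`, a per-pixel score map of shape H x W x C, and a boolean instance mask. Borrowing each virtual point's depth from the nearest projected real point is likewise an assumption made for illustration; the abstract does not say how depth is assigned.

```python
import numpy as np

def project(points_xyz, K):
    """Project N x 3 camera-frame points to pixels with a 3 x 3 pinhole matrix K."""
    uvw = points_xyz @ K.T
    depth = uvw[:, 2]
    # Clamp the divisor so points behind the camera do not divide by zero.
    uv = uvw[:, :2] / np.maximum(depth, 1e-6)[:, None]
    return uv, depth

def pixel_lookup(uv, depth, height, width):
    """Return integer pixel rows/cols and a mask of points that land in the image."""
    cols = np.floor(uv[:, 0]).astype(np.int64)
    rows = np.floor(uv[:, 1]).astype(np.int64)
    valid = (depth > 0) & (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    return rows, cols, valid

def paint_points(points_xyz, K, score_map):
    """Append the C class scores of the hit pixel to each point (zeros if off-image)."""
    height, width, n_classes = score_map.shape
    uv, depth = project(points_xyz, K)
    rows, cols, valid = pixel_lookup(uv, depth, height, width)
    scores = np.zeros((points_xyz.shape[0], n_classes))
    scores[valid] = score_map[rows[valid], cols[valid]]
    return np.hstack([points_xyz, scores]), valid

def virtual_points(points_xyz, K, instance_mask, n_samples, rng):
    """Sample pixels in one boolean instance mask and lift them to 3D.

    Assumption: each sampled pixel borrows the depth of the nearest real
    point that projects inside the same mask.
    """
    height, width = instance_mask.shape
    uv, depth = project(points_xyz, K)
    rows, cols, valid = pixel_lookup(uv, depth, height, width)
    inside = valid.copy()
    inside[valid] = instance_mask[rows[valid], cols[valid]]
    if not inside.any():
        return np.empty((0, 3))  # no real point to borrow depth from
    seed_uv, seed_depth = uv[inside], depth[inside]
    mask_rows, mask_cols = np.nonzero(instance_mask)
    pick = rng.integers(0, mask_rows.size, size=n_samples)
    # Use pixel centres as the sampled image locations.
    sample_uv = np.stack([mask_cols[pick], mask_rows[pick]], axis=1) + 0.5
    dist2 = ((sample_uv[:, None, :] - seed_uv[None, :, :]) ** 2).sum(axis=-1)
    z = seed_depth[dist2.argmin(axis=1)]
    # Back-project: X = z * K^{-1} [u, v, 1]^T.
    rays = np.hstack([sample_uv, np.ones((n_samples, 1))]) @ np.linalg.inv(K).T
    return rays * z[:, None]
```

In a pipeline of this kind, the virtual points would be concatenated with the real points and painted with the same score map before voxelization, so the detector sees a denser and semantically enriched cloud on distant or small instances.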