3D Object Detection Method with Image Semantic Feature Guidance and Cross-Modal Fusion of Point Cloud

Cited by: 0

Authors:

Li, Hui ^{[1]}

Wang, Junyin ^{[1]}

Cheng, Yuanzhi ^{[2]}

Liu, Jian ^{[3]}

Zhao, Guowei ^{[1]}

Chen, Shuangmin ^{[1]}

Affiliations:

[1] School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao

[2] Faculty of Computing, Harbin Institute of Technology, Harbin

[3] College of Computer Science, Nankai University, Tianjin

Source:

Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2024, Vol. 36, No. 05

Keywords:

3D object detection; anchor-free; cross-modal; point cloud; semantic feature;

DOI:

10.3724/SP.J.1089.2024.19862

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Owing to scene complexity, object scale variation, and occlusion, object detection still faces many challenges. Cross-modal fusion of image and LiDAR point cloud features can effectively improve 3D object detection, but both the fusion quality and the detection performance leave room for improvement. This paper therefore first designs an image semantic feature learning network that computes position and channel self-attention in two parallel branches, achieving global semantic enhancement and reducing target misclassification. Second, it proposes a local semantic fusion module guided by image semantic features, which uses element-level concatenation to fuse point cloud data with the local semantic features of the retrieved images, so as to better address semantic alignment in cross-modal fusion. Third, it proposes a multi-scale re-fusion network with an interaction module between the fused features and the LiDAR features, which learns multi-scale connections within the fused features and re-fuses features of different resolutions to improve detection performance. Finally, four task losses are adopted to train an anchor-free 3D multi-object detector. In comparisons with other methods on the KITTI and nuScenes datasets, the detection accuracy for 3D objects reaches 87.15%, and the experimental results show that the proposed method outperforms the compared methods. © 2024 Institute of Computing Technology. All rights reserved.
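The position and channel dual-branch self-attention mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract gives no layer details, so the sketch assumes a single feature map of shape (C, H, W), uses the raw features as queries, keys, and values (no learned projections), and merges the two branches by a plain residual sum. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(feat):
    """Each spatial location attends to every other location.

    feat: array of shape (C, H, W). Assumption: no learned projections.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N) with N = H * W
    attn = softmax(flat.T @ flat, axis=-1)   # (N, N), each row sums to 1
    out = flat @ attn.T                      # out[:, i] = sum_j attn[i, j] * flat[:, j]
    return out.reshape(c, h, w)


def channel_attention(feat):
    """Each channel attends to every other channel."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N)
    attn = softmax(flat @ flat.T, axis=-1)   # (C, C), each row sums to 1
    out = attn @ flat                        # (C, N)
    return out.reshape(c, h, w)


def dual_branch_attention(feat):
    """Compute both branches on the same input and sum them with the input.

    The residual sum is an assumption made for this sketch.
    """
    return feat + position_attention(feat) + channel_attention(feat)
```

Both branches read the same input and do not depend on each other, which is what allows them to be evaluated in parallel; the output keeps the input shape, so it can replace the original feature map in later stages.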


Pages: 734-749

Number of pages: 15
