CDAF3D: Cross-Dimensional Attention Fusion for Indoor 3D Object Detection

Cited by: 0
Authors
Wang, Shilin [1]
Huang, Hai [1]
Zhu, Yueyan [1]
Tang, Zhenqi [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100876, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Indoor 3D Object Detection; Fusion Features; Point Cloud;
DOI
10.1007/978-981-97-8493-6_12
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D object detection is a crucial task in computer vision and autonomous systems, with wide use in robotics, autonomous driving, and augmented reality. As input devices have advanced, researchers have proposed using multimodal information to improve detection accuracy. However, effectively integrating 2D and 3D features to exploit their complementary nature for detection remains a challenge. In this paper, we note that the complementary nature of geometric and visual texture information can effectively strengthen feature fusion, which plays a key role in detection. To this end, we propose the Cross-Dimensional Attention Fusion-based indoor 3D object detection method (CDAF3D). This method dynamically associates geometric information with the corresponding 2D image texture details through a cross-dimensional attention mechanism, enabling the model to capture and integrate spatial and textural information effectively. Additionally, because intersecting entities with different labels are unrealistic in 3D object detection, we further propose the Preventive 3D Intersect Loss (P3DIL). This loss enhances detection accuracy by addressing intersections between objects of different labels. We evaluate the proposed CDAF3D on the SUN RGB-D and ScanNet V2 datasets. CDAF3D achieves 78.2 mAP@0.25 and 66.5 mAP@0.50 on ScanNet V2, and 70.3 mAP@0.25 and 54.1 mAP@0.50 on SUN RGB-D. The proposed CDAF3D outperforms all the multi-sensor-based methods at 3D IoU thresholds of 0.25 and 0.5.
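The abstract does not give the formula for P3DIL, so the following is only a rough sketch of the general idea it describes: penalizing overlap between predicted boxes that carry different labels. Everything here is an assumption made for illustration, including the axis-aligned box format `(xmin, ymin, zmin, xmax, ymax, zmax)`, the hypothetical function names `overlap_volume` and `intersect_penalty`, and the choice to average intersection volume over differently labeled pairs. It should not be read as the authors' loss.

```python
from itertools import combinations


def overlap_volume(a, b):
    """Intersection volume of two axis-aligned boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax); this format is an
    assumption for illustration, not taken from the paper.
    """
    vol = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])
        hi = min(a[axis + 3], b[axis + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis, so no intersection
        vol *= hi - lo
    return vol


def intersect_penalty(boxes, labels):
    """Mean intersection volume over box pairs with different labels.

    Hypothetical stand-in for an intersection-discouraging term; pairs
    that share a label are ignored.
    """
    pairs = [
        (i, j)
        for i, j in combinations(range(len(boxes)), 2)
        if labels[i] != labels[j]
    ]
    if not pairs:
        return 0.0
    total = sum(overlap_volume(boxes[i], boxes[j]) for i, j in pairs)
    return total / len(pairs)


boxes = [
    (0, 0, 0, 2, 2, 2),
    (1, 1, 1, 3, 3, 3),
    (10, 10, 10, 11, 11, 11),
]
labels = ["chair", "table", "chair"]
penalty = intersect_penalty(boxes, labels)
```

In this toy input, only the first two boxes overlap and they carry different labels, so they contribute to the penalty; the two "chair" boxes are never compared. A trainable version would need a differentiable formulation and likely oriented boxes, which this sketch does not attempt.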
Pages: 165-177
Number of pages: 13
Related papers
50 in total
  • [41] Characteristics of Earthquake Cycles: A Cross-Dimensional Comparison of 0D to 3D Numerical Models
    Li, Meng
    Pranger, Casper
    van Dinther, Ylona
    JOURNAL OF GEOPHYSICAL RESEARCH-SOLID EARTH, 2022, 127 (08)
  • [42] Towards Raw Sensor Fusion in 3D Object Detection
    Rovid, Andras
    Remeli, Viktor
    2019 IEEE 17TH WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS (SAMI 2019), 2019, : 293 - 298
  • [43] TR3D: TOWARDS REAL-TIME INDOOR 3D OBJECT DETECTION
    Rukhovich, Danila
    Vorontsova, Anna
    Konushin, Anton
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 281 - 285
  • [44] Cross-dimensional adaptivity research on a 3D earth observation data cube model
    Yu, Jinsongdi
    Cui, Zhanying
    Baumann, Peter
    Tong, Ruiju
    Wei, Dandan
    Luo, Yuan
    OPEN GEOSCIENCES, 2025, 17 (01):
  • [45] SOFW: A Synergistic Optimization Framework for Indoor 3D Object Detection
    Dai, Kun
    Jiang, Zhiqiang
    Xie, Tao
    Wang, Ke
    Liu, Dedong
    Fan, Zhendong
    Li, Ruifeng
    Zhao, Lijun
    Omar, Mohamed
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 637 - 651
  • [46] Spatial and Semantic Information Enhancement for Indoor 3D Object Detection
    Chen, Chunmei
    Liang, Zhiqiang
    Liu, Haitao
    Liu, Xin
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2023, 20 (05) : 831 - 839
  • [47] Real-Time 3D Visual Perception by Cross-Dimensional Refined Learning
    Hong, Ziyang
    Yue, C. Patrick
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 10326 - 10338
  • [48] 2D3D-DescNet: Jointly Learning 2D and 3D Local Feature Descriptors for Cross-Dimensional Matching
    Chen, Shuting
    Su, Yanfei
    Lai, Baiqi
    Cai, Luwei
    Hong, Chengxi
    Li, Li
    Qiu, Xiuliang
    Jia, Hong
    Liu, Weiquan
    REMOTE SENSING, 2024, 16 (13)
  • [49] 3D-MAN: 3D Multi-frame Attention Network for Object Detection
    Yang, Zetong
    Zhou, Yin
    Chen, Zhifeng
    Ngiam, Jiquan
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1863 - 1872
  • [50] CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection
    Hwang, Jyh-Jing
    Kretzschmar, Henrik
    Manela, Joshua
    Rafferty, Sean
    Armstrong-Crews, Nicholas
    Chen, Tiffany
    Anguelov, Dragomir
    COMPUTER VISION, ECCV 2022, PT XXXVIII, 2022, 13698 : 388 - 405