Multi-modal object detection via transformer network

Cited by: 2

Authors:

Liu, Wenbing [1,2]

Wang, Haibo [1,2]

Gao, Quanxue [1,3]

Zhu, Zhaorui [1]

Affiliations:

[1] Xidian Univ, Sch Telecommun Engn, Xian, Shaanxi, Peoples R China

[2] Sci & Technol Electroopt Control Lab, Xian, Henan, Peoples R China

[3] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China

Source:

IET IMAGE PROCESSING | 2023 / Vol. 17 / Issue 12

Keywords:

image representations; object detection;

DOI:

10.1049/ipr2.12884

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Because single-modal data usually contain limited information, a great deal of effort has been devoted to exploiting the complementary information contained in multi-modal data. This paper therefore concerns an object detection method that can fully utilize multi-modal data. First, the method introduces the transformer mechanism to fuse the intra-modal and inter-modal features of the different modalities. The aim is to exploit the complementarity of data between modalities, which helps to improve the performance of multi-modal object detection. Second, a contrastive loss suitable for contrastive learning is applied, which makes effective use of label information. Extensive experiments on multiple object detection datasets demonstrate the effectiveness of the proposed method.


Pages: 3541 - 3550

Number of pages: 10

50 items in total

[41] Exploiting Multi-Modal Synergies for Enhancing 3D Multi-Object Tracking
Xu, Xinglong
Ren, Weihong
Chen, Xi'ai
Fan, Huijie
Han, Zhi
Liu, Honghai
IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10) : 8643 - 8650
[42] TransMIN: Transformer-Guided Multi-Interaction Network for Remote Sensing Object Detection
Xu, Guangming
Song, Tiecheng
Sun, Xia
Gao, Chenqiang
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
[44] Multi-modal Data Analysis and Fusion for Robust Object Detection in 2D/3D Sensing
Schierl, Jonathan
Graehling, Quinn
Aspiras, Theus
Asari, Vijay
Van Rynbach, Andre
Rabb, Dave
2020 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR): TRUSTED COMPUTING, PRIVACY, AND SECURING MULTIMEDIA, 2020
[45] GraphAlign++: An Accurate Feature Alignment by Graph Matching for Multi-Modal 3D Object Detection
Song, Ziying
Jia, Caiyan
Yang, Lei
Wei, Haiyue
Liu, Lin
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 2619 - 2632
[46] GEM: Glare or Gloom, I Can Still See You - End-to-End Multi-Modal Object Detection
Mazhar, Osama
Babuska, Robert
Kober, Jens
IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (04) : 6321 - 6328
[47] EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection
Liu, Zhe
Huang, Tengteng
Li, Bingling
Chen, Xiwu
Wang, Xi
Bai, Xiang
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (07) : 8324 - 8341
[48] Bridging the View Disparity Between Radar and Camera Features for Multi-Modal Fusion 3D Object Detection
Zhou, Taohua
Chen, Junjie
Shi, Yining
Jiang, Kun
Yang, Mengmeng
Yang, Diange
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (02) : 1523 - 1535
[49] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
Wang, Li
Zhang, Xinyu
Li, Jun
Xv, Baowei
Fu, Rong
Chen, Haifeng
Yang, Lei
Jin, Dafeng
Zhao, Lijun
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
[50] Object Detection Algorithm Based on CNN-Transformer Dual Modal Feature Fusion
Yang Chen
Hou Zhiqiang
Li Xinyue
Ma Sugang
Yang Xiaobao
ACTA PHOTONICA SINICA, 2024, 53 (03)
