Multimodal Fusion-Based Semantic Transmission for Road Object Detection

Cited by: 0
Authors
Zhu Z. [1 ]
Wei Z. [2 ]
Zhang R. [3 ]
Yang L. [1 ]
Affiliations
[1] Intelligent Transportation Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou
[2] Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai
[3] School of Software Engineering, Tongji University, Shanghai
Source
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence | 2023, Vol. 36, No. 11
Funding
National Natural Science Foundation of China
Keywords
Heterogeneous Data; Multimodal Fusion; Road Object Detection; Self-Attention Mechanism; Semantic Communication;
DOI
10.16451/j.cnki.issn1003-6059.202311004
Abstract
In extreme scenarios with long-tail effects, collaborative perception across multiple vehicles and sensors can provide vehicles with effective sensory information. However, the differences among heterogeneous data, combined with bandwidth constraints and diverse data formats, make it difficult for vehicles to schedule and process this information in a unified and efficient manner. To integrate multi-sensor information from different vehicles under limited communication bandwidth, a Transformer-based semantic communication framework for multimodal fusion object detection is proposed in this paper. Unlike traditional data transmission schemes, the framework employs self-attention mechanisms to fuse data from different modalities, focusing on the semantic correlations and dependencies among them. This enables vehicles to exchange information and collaborate under limited communication resources, thereby improving their understanding of complex road conditions. Experimental results on the Teledyne FLIR Free ADAS Thermal dataset show that the proposed model performs well on multimodal object detection semantic communication tasks, significantly improving object detection accuracy while cutting transmission cost by half. © 2023 Journal of Pattern Recognition and Artificial Intelligence. All rights reserved.
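To make the self-attention fusion described in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' implementation: camera and thermal feature tokens are tagged with modality embeddings and passed jointly through a Transformer encoder so attention can model cross-modal dependencies. The module name CrossModalFusion, the token shapes, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): self-attention fusion of RGB and
# thermal feature tokens, roughly in the spirit of the abstract. Names and
# hyperparameters (CrossModalFusion, embed_dim, ...) are illustrative.
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """Fuse two modality token sequences with a shared Transformer encoder."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 8, num_layers: int = 2):
        super().__init__()
        # Learnable modality embeddings mark which tokens come from the camera
        # stream and which from the thermal stream.
        self.modality_embed = nn.Parameter(torch.zeros(2, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

    def forward(self, rgb_tokens: torch.Tensor, thermal_tokens: torch.Tensor) -> torch.Tensor:
        # rgb_tokens: (B, N_rgb, D); thermal_tokens: (B, N_thermal, D)
        rgb = rgb_tokens + self.modality_embed[0]
        thermal = thermal_tokens + self.modality_embed[1]
        # Concatenate along the token axis so self-attention spans both
        # modalities and can capture cross-modal correlations directly.
        fused = self.encoder(torch.cat([rgb, thermal], dim=1))
        return fused  # (B, N_rgb + N_thermal, D), to be fed to a detection head


if __name__ == "__main__":
    fusion = CrossModalFusion()
    rgb = torch.randn(2, 100, 256)      # e.g. a flattened camera feature map
    thermal = torch.randn(2, 100, 256)  # e.g. a flattened thermal feature map
    print(fusion(rgb, thermal).shape)   # torch.Size([2, 200, 256])
```

In a semantic communication setting, only the compact fused token representation (or a further compressed version of it) would be transmitted instead of raw images, which is the intuition behind the reported reduction in transmission cost.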
Pages: 1009-1018
Page count: 9