Visual Attention Estimation Algorithm and Dynamic Neural Network Based Object Detection for Intelligent Vehicles

Cited by: 0
Authors
Zhao, Baixuan [1]
Hu, Chuan [1]
Zhu, Wangwang [1]
Hu, Shuang [1]
Zhang, Xi [1]
Affiliation
[1] Shanghai Jiao Tong Univ, Sch Mech Engn, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Visualization; Object detection; Vehicle dynamics; Estimation; Neural networks; Intelligent vehicles; Heuristic algorithms; Dynamic neural network; intelligent vehicle; object detection; uneven downsampling; visual attention estimation;
DOI
10.1109/JSEN.2024.3387281
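Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification codes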
0808 ; 0809 ;
Abstract
Vision-based object detection for intelligent vehicles differs from regular object detection tasks. It requires an attention region: the part of the scene where objects have a critical impact on the vehicle's driving behavior. This region is influenced by many factors, especially the ego-vehicle's driving state and the surrounding traffic participants. Existing approaches process the entire image uniformly, leading to significant computational redundancy. The objective of this study is to estimate this region accurately and efficiently and to enhance detection capability within it, thereby improving the detection performance for critical objects. First, an attention region estimation method is proposed. This method treats the estimation of the attention region as a mass center computation that takes into account the ego-vehicle's driving state and the surrounding traffic participants. Second, an object detection method based on a dynamic neural network is proposed. This method adaptively allocates computational resources according to the attention region. It employs an uneven downsampling method during preprocessing and introduces a dynamic block into the network architecture, which is designed to optimize feature representation within the attention region, thereby enhancing detection performance. To verify the effectiveness of the proposed method, we established a real-vehicle dataset in which the critical objects are annotated by the drivers. Compared to the baseline model, the mean average precision (mAP) of critical objects increased by 3.6%, while the average computation decreased by 23.6%. The proposed method offers an efficient and accurate approach to object detection for intelligent vehicles, with the potential to enhance driving safety and reduce computational demands.
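The two ideas the abstract describes, estimating the attention region as a mass center and downsampling unevenly around it, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the (x, y, weight) point format, the optional prior point standing in for the ego-vehicle's driving state, and the Gaussian density used to space the samples are all assumptions made for illustration.

```python
import math


def attention_center(points, prior=None, prior_weight=0.0):
    """Weighted mass center of (x, y, weight) points in image coordinates.

    `points` stands in for surrounding traffic participants; `prior` is a
    hypothetical point derived from the ego-vehicle state (for example a
    look-ahead point), mixed in with weight `prior_weight`.
    """
    pts = list(points)
    if prior is not None and prior_weight > 0:
        pts.append((prior[0], prior[1], prior_weight))
    total = sum(w for _, _, w in pts)
    if total <= 0:
        raise ValueError("total weight must be positive")
    cx = sum(x * w for x, _, w in pts) / total
    cy = sum(y * w for _, y, w in pts) / total
    return cx, cy


def uneven_sample_indices(length, out_length, center, sigma, gain=3.0):
    """Pick `out_length` source indices from range(length), denser near `center`.

    Assumed scheme: a sampling density of 1 + gain * Gaussian(center, sigma)
    is accumulated into a CDF, and the CDF is inverted at evenly spaced levels.
    """
    density = [
        1.0 + gain * math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(length)
    ]
    cdf = []
    acc = 0.0
    for d in density:
        acc += d
        cdf.append(acc)

    indices = []
    j = 0
    for k in range(out_length):
        # Evenly spaced level in (0, acc); always below the final CDF value.
        target = (k + 0.5) / out_length * acc
        while cdf[j] < target:
            j += 1
        indices.append(j)
    return indices


if __name__ == "__main__":
    # Two equally weighted participants plus a hypothetical look-ahead prior.
    cx, cy = attention_center([(300, 200, 1.0), (500, 240, 1.0)],
                              prior=(400, 260), prior_weight=0.5)
    cols = uneven_sample_indices(length=800, out_length=200, center=cx, sigma=80)
    print(cx, cy, cols[:5], cols[-5:])
```

Applying the same index selection to image rows (with the center's y coordinate) and to columns gives a smaller image that keeps more pixels near the estimated attention center and fewer elsewhere, which is the general effect the abstract attributes to uneven downsampling.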
Pages: 18535-18545
Page count: 11