Driver attention prediction based on convolution and transformers

Cited by: 30
Authors
Gou, Chao [1, 2]
Zhou, Yuchen [1, 2]
Li, Dan [3]
Affiliations
[1] Sun Yat Sen Univ, Shenzhen Campus, Shenzhen 518107, Peoples R China
[2] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Guangzhou 510275, Peoples R China
[3] Sun Yat Sen Univ, Zhuhai Campus, Zhuhai 519082, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Attention prediction; Transformer; Autonomous driving; Human-machine augmented intelligence; MODEL;
DOI
10.1007/s11227-021-04151-2
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Studying how drivers allocate their attention while driving is critical to achieving human-like cognitive ability in autonomous vehicles, and in recent years it has been an active topic in the human-machine augmented intelligence community for self-driving. However, existing state-of-the-art methods for driver attention prediction are mainly built on convolutional neural networks (CNNs), whose local receptive fields limit their ability to capture long-range dependencies. In this work, we propose a novel attention prediction method based on CNN and Transformer, termed ACT-Net. In particular, a CNN and a Transformer are combined into a block, and these blocks are stacked to form the deep model. This design captures both local and long-range dependencies, both of which are crucial for driver attention prediction. Extensive comparisons with other state-of-the-art techniques, conducted on the widely used BDD-A dataset and on privately collected data on BDD-X, validate the effectiveness of the proposed ACT-Net.
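The block design described in the abstract (a local convolution paired with a Transformer-style attention step, then stacked) can be illustrated with a toy sketch. This is not the ACT-Net implementation: the function names `local_conv`, `self_attention`, `hybrid_block`, and `stack_blocks`, the fixed smoothing kernel, the identity attention projections, and the residual connections are all assumptions made for illustration, and the input is a 1D token sequence rather than an image feature map.

```python
import math

def local_conv(tokens, kernel=(0.25, 0.5, 0.25)):
    """Depthwise 1D convolution with zero padding: each output mixes only adjacent tokens."""
    n, half, dim = len(tokens), len(kernel) // 2, len(tokens[0])
    out = []
    for i in range(n):
        acc = [0.0] * dim
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:  # positions outside the sequence count as zeros
                for d in range(dim):
                    acc[d] += w * tokens[j][d]
        out.append(acc)
    return out

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections (Q = K = V = tokens)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out

def add(a, b):
    """Element-wise sum of two token sequences (used for residual connections)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def hybrid_block(tokens):
    """One toy block: a local convolution step followed by a global attention step."""
    tokens = add(tokens, local_conv(tokens))
    return add(tokens, self_attention(tokens))

def stack_blocks(tokens, depth=2):
    """Apply the hybrid block `depth` times, mirroring the stacking idea."""
    for _ in range(depth):
        tokens = hybrid_block(tokens)
    return tokens
```

With a five-token input whose first and last tokens carry different features, `local_conv` leaves the last position untouched by the first token, while `self_attention` lets every position draw on every other one. That contrast is the local versus long-range distinction the abstract refers to.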
Pages: 8268-8284
Number of pages: 17
Related papers
36 in total
[1]   Alaparthi S, 2020, ABS200701127 ARXIV
[2]   End-to-End Object Detection with Transformers [J].
Carion, Nicolas ;
Massa, Francisco ;
Synnaeve, Gabriel ;
Usunier, Nicolas ;
Kirillov, Alexander ;
Zagoruyko, Sergey .
COMPUTER VISION - ECCV 2020, PT I, 2020, 12346 :213-229
[3]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[4]   How Do Drivers Allocate Their Potential Attention? Driving Fixation Prediction via Convolutional Neural Networks [J].
Deng, Tao ;
Yan, Hongmei ;
Qin, Long ;
Thuyen Ngo ;
Manjunath, B. S. .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (05) :2146-2154
[5]   Learning to Boost Bottom-Up Fixation Prediction in Driving Environments via Random Forest [J].
Deng, Tao ;
Yan, Hongmei ;
Li, Yong-Jie .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (09) :3059-3067
[6]   Dosovitskiy A, 2020, ARXIV
[7]   Fang J, 2021, IEEE T INTELL TRANSP, P1
[8]   Fang JW, 2019, IEEE INT C INTELL TR, P4303, DOI 10.1109/ITSC.2019.8917218
[9]   Han K, 2021, ABS210300112 ARXIV
[10]   Harel J, 2006, ADV NEURAL INFORM PR, P545