Driver attention prediction based on convolution and transformers

Cited by: 30
Authors
Gou, Chao [1 ,2 ]
Zhou, Yuchen [1 ,2 ]
Li, Dan [3 ]
Affiliations
[1] Sun Yat Sen Univ, Shenzhen Campus, Shenzhen 518107, Peoples R China
[2] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Guangzhou 510275, Peoples R China
[3] Sun Yat Sen Univ, Zhuhai Campus, Zhuhai 519082, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Attention prediction; Transformer; Autonomous driving; Human-machine augmented intelligence; MODEL;
DOI
10.1007/s11227-021-04151-2
CLC classification number
TP3 [Computing technology, computer technology];
Subject classification code
0812 ;
Abstract
In recent years, studying how drivers allocate their attention while driving has become critical to achieving human-like cognitive ability in autonomous vehicles, and it is an active topic in the human-machine augmented intelligence community for self-driving. However, existing state-of-the-art methods for driver attention prediction are mainly built upon convolutional neural networks (CNNs), whose local receptive fields limit their ability to capture long-range dependencies. In this work, we propose a novel attention prediction method based on CNN and Transformer, termed ACT-Net. In particular, a CNN and a Transformer are combined into a block, and these blocks are stacked to form the deep model. Through this design, the model captures both local and long-range dependencies, both of which are crucial for driver attention prediction. Exhaustive comparison experiments against other state-of-the-art techniques, conducted on the widely used BDD-A dataset and on privately collected BDD-X data, validate the effectiveness of the proposed ACT-Net.
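The abstract describes stacking blocks that pair a convolution (local receptive field) with a Transformer-style self-attention layer (long-range dependencies), joined by residual connections so the blocks can be stacked. The paper's actual ACT-Net layer sizes and wiring are not given in this record; the following is only a minimal single-head NumPy sketch of that general conv-plus-attention pattern, with all shapes, parameter names, and the depthwise-convolution choice being illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_conv1d(x, kernel):
    # x: (T, C) sequence of features; kernel: (k, C), one filter per channel.
    # Zero-padded 'same' convolution: captures only a local neighborhood.
    T, C = x.shape
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(T):
        out[t] = (xp[t:t + k] * kernel).sum(axis=0)
    return out

def self_attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product attention: every position attends to
    # every other position, giving long-range dependencies in one step.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def conv_transformer_block(x, kernel, Wq, Wk, Wv):
    # Local features from the convolution, then global context from
    # attention; residual connections keep the block stackable.
    x = x + depthwise_conv1d(x, kernel)
    x = x + self_attention(x, Wq, Wk, Wv)
    return x

rng = np.random.default_rng(0)
T, C = 16, 8  # toy sequence length and channel count
x = rng.standard_normal((T, C))
params = (rng.standard_normal((3, C)) * 0.1,   # conv kernel, k=3
          rng.standard_normal((C, C)) * 0.1,   # Wq
          rng.standard_normal((C, C)) * 0.1,   # Wk
          rng.standard_normal((C, C)) * 0.1)   # Wv
y = conv_transformer_block(x, *params)
y = conv_transformer_block(y, *params)  # blocks stack; shape is preserved
print(y.shape)  # (16, 8)
```

Because each block preserves the input shape, depth is controlled simply by how many blocks are stacked, which is the design property the abstract relies on.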
Pages: 8268-8284
Number of pages: 17
Related papers
36 records in total
[11]   SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks [J].
Huang, Xun ;
Shen, Chengyao ;
Boix, Xavier ;
Zhao, Qi .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :262-270
[12]   A model of saliency-based visual attention for rapid scene analysis [J].
Itti, L ;
Koch, C ;
Niebur, E .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (11) :1254-1259
[13]   Textual Explanations for Self-Driving Vehicles [J].
Kim, Jinkyu ;
Rohrbach, Anna ;
Darrell, Trevor ;
Canny, John ;
Akata, Zeynep .
COMPUTER VISION - ECCV 2018, PT II, 2018, 11206 :577-593
[14]   Saliency Heat-Map as Visual Attention for Autonomous Driving Using Generative Adversarial Network (GAN) [J].
Lateef, Fahad ;
Kas, Mohamed ;
Ruichek, Yassine .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (06) :5360-5373
[15]   Methods for comparing scanpaths and saliency maps: strengths and weaknesses [J].
Le Meur, Olivier ;
Baccino, Thierry .
BEHAVIOR RESEARCH METHODS, 2013, 45 (01) :251-266
[16]   Improving Driver Gaze Prediction With Reinforced Attention [J].
Lv, Kai ;
Sheng, Hao ;
Xiong, Zhang ;
Li, Wei ;
Zheng, Liang .
IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 :4198-4207
[17]   SELECTIVE ATTENTION GATES VISUAL PROCESSING IN THE EXTRASTRIATE CORTEX [J].
MORAN, J ;
DESIMONE, R .
SCIENCE, 1985, 229 (4715) :782-784
[18]   A Reference Model for Driver Attention in Automation: Glance Behavior Changes During Lateral and Longitudinal Assistance [J].
Morando, Alberto ;
Victor, Trent ;
Dozza, Marco .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (08) :2999-3009
[19]   "Looking at the right stuff" - Guided semantic-gaze for autonomous driving [J].
Pal, Anwesan ;
Mondal, Sayan ;
Christensen, Henrik I. .
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, :11880-11889
[20]   Predicting the Driver's Focus of Attention: The DR(eye)VE Project [J].
Palazzi, Andrea ;
Abati, Davide ;
Calderara, Simone ;
Solera, Francesco ;
Cucchiara, Rita .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (07) :1720-1733