Soft Spatial Attention-Based Multimodal Driver Action Recognition Using Deep Learning

Cited by: 34
Authors
Jegham, Imen [1]
Ben Khalifa, Anouar [2]
Alouani, Ihsen [3]
Mahjoub, Mohamed Ali [2]
Affiliations
[1] Univ Sousse, Inst Super Informat & Tech Commun H Sousse, LATIS, Sousse 4011, Tunisia
[2] Univ Sousse, Ecole Natl Ingenieurs Sousse, LATIS, Sousse 4023, Tunisia
[3] Univ Polytech Hauts De France, IEMN DOAE, F-59300 Valenciennes, France
Keywords
Vehicles; Sensors; Visualization; Computational modeling; Machine learning; Monitoring; Task analysis; Driver action recognition; Kinect sensor; spatial soft attention; multimodal; deep learning; CLASSIFICATION; NETWORKS; KINECT
DOI
10.1109/JSEN.2020.3019258
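Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract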
Driver behaviors and decisions are crucial factors for on-road driving safety. A precise driver behavior monitoring system can significantly reduce traffic accidents and injuries. However, understanding human behavior in real-world driving settings is challenging because of uncontrolled conditions, including illumination variation, occlusion, and dynamic, cluttered backgrounds. In this paper, a Kinect sensor, which provides multimodal signals, is adopted as a driver monitoring sensor to recognize safe driving and the most distracting common secondary in-vehicle actions. We propose a novel soft spatial attention-based network named the Depth-based Spatial Attention network (DSA), which adds a cognitive process to a deep network by selectively focusing on the driver's silhouette and motion in the cluttered driving scene. Specifically, at each time step t, we introduce a new weighted RGB frame based on an attention model designed using a depth frame. The final classification accuracy is substantially higher than state-of-the-art results, with an improvement of up to 27%.
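The idea of weighting an RGB frame with a depth-derived attention map can be illustrated with a minimal sketch. This is not the DSA architecture, which the abstract does not specify. It assumes, purely for illustration, that pixels nearer to the sensor are more likely to belong to the driver, and the function names `depth_attention_mask` and `weight_rgb_frame` are hypothetical.

```python
import numpy as np


def depth_attention_mask(depth, eps=1e-8):
    """Hypothetical soft mask in [0, 1]: nearer pixels get larger weights.

    Assumption (not from the paper): the driver is closer to the sensor
    than the background, so a small depth value means high attention.
    """
    depth = depth.astype(np.float64)
    d_min, d_max = depth.min(), depth.max()
    # Min-max normalize: 0 for the nearest pixel, about 1 for the farthest.
    normalized = (depth - d_min) / (d_max - d_min + eps)
    return 1.0 - normalized


def weight_rgb_frame(rgb, depth):
    """Scale every RGB channel of a frame by the soft depth mask."""
    mask = depth_attention_mask(depth)
    # Add a trailing axis so the (H, W) mask broadcasts over (H, W, 3).
    return rgb.astype(np.float64) * mask[..., None]


if __name__ == "__main__":
    rgb = np.full((2, 2, 3), 100, dtype=np.uint8)
    depth = np.array([[1.0, 3.0], [5.0, 5.0]])
    print(weight_rgb_frame(rgb, depth)[..., 0])
```

In a pipeline of this kind, the weighted frame produced at each time step t would be passed to the recognition network in place of the raw RGB frame. A learned attention model would replace the fixed min-max rule used here.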
Pages: 1918-1925
Number of pages: 8