A Lightweight and Efficient Distracted Driver Detection Model Fusing Convolutional Neural Network and Vision Transformer

Cited by: 1

Authors:

Li, Zhao [1]

Zhao, Xia [2]

Wu, Fuwei [1]

Chen, Dan [3]

Wang, Chang [1]

Affiliations:

[1] Changan Univ, Sch Automobile, Xian 710061, Peoples R China

[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China

[3] Changan Univ, Sch Elect & Control Engn, Xian 710061, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024

Keywords:

Feature extraction; Vehicles; Transformers; Convolution; Computational modeling; Accuracy; Convolutional neural networks; Traffic safety; distracted driver recognition; convolutional neural network; vision transformer; lightweight model; multi-scale feature; RECOGNITION; FRAMEWORK;

DOI:

10.1109/TITS.2024.3447041

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Identifying distracted drivers is crucial for enhancing driving safety and advancing intelligent driver assistance systems. Recently, researchers have applied Convolutional Neural Network (CNN) and Vision Transformer (ViT) models for driver state decision. However, both models often suffer from several issues such as numerous parameters and low detection efficiency. To address these challenges, this study proposes the Convolution Vision Transformer (CoViT) model for distracted driver identification, leveraging techniques such as Low Complexity Attention Mechanism (LCAM), Multi-scale Dilation Convolution (MSDC), and Depth Separable Convolution (DSC). Moreover, the CoViT model features a typical "pyramid" structure, enabling effective feature extraction across different scales. Subsequently, the proposed system is trained and evaluated using the publicly available driving behavior datasets SFD2 and 100-Driver, as well as real-world road experiments. Experimental results show that the CoViT model yields high recognition performance, with mean Accuracy (mAcc) scores of 95.17%, 97.89%, and 93.54% on the recorded dataset, SFD2 dataset, and 100-Driver dataset, respectively. These scores surpass those obtained by similar lightweight models. Furthermore, ablation experiments reveal that deep and dilated convolution significantly enhance model performance. In addition, the CoViT model demonstrates its applicability to real-time driving behavior detection tasks, with a parametric count of just 1.24M - a reduction of 2.67M compared to MobileNetV3 - and an online inference Frames Per Second (FPS) of 159.13.
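The abstract credits much of the model's small size to depthwise separable and dilated convolutions. The sketch below is not the CoViT implementation. It only illustrates, with standard textbook formulas, why these two building blocks are attractive for lightweight models. The channel counts (64 in, 128 out), the 3x3 kernel, and the function names are assumptions chosen for illustration and do not come from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dilated_extent(k: int, dilation: int) -> int:
    """Spatial extent covered by a k-tap kernel with the given dilation rate."""
    return dilation * (k - 1) + 1


# Hypothetical layer sizes, for illustration only.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.1f}x")

# Dilation widens the covered area without adding weights.
for rate in (1, 2, 3):
    print(f"dilation {rate}: 3-tap kernel spans {dilated_extent(3, rate)} pixels")
```

Under these assumed sizes the separable form needs roughly an eighth of the weights, while running the same 3x3 kernel at several dilation rates yields multi-scale coverage at no extra parameter cost.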

Pages: 19962-19978

Page count: 17
