Two-Stream Spatial-Temporal Graph Convolutional Networks for Driver Drowsiness Detection

Cited: 34
Authors
Bai, Jing [1 ,2 ]
Yu, Wentao [1 ,2 ]
Xiao, Zhu [3 ]
Havyarimana, Vincent [3 ,4 ]
Regan, Amelia C. [5 ]
Jiang, Hongbo [3 ]
Jiao, Licheng [1 ,2 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
[2] Xidian Univ, Key Lab Intelligent Percept & Image Understanding, Minist Educ, Xian 710071, Peoples R China
[3] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[4] Ecole Normale Super, Dept Appl Sci, Bujumbura 6983, Burundi
[5] Univ Calif Irvine, Dept Comp Sci & Inst Transportat Studies, Irvine, CA 92697 USA
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Faces; Mouth; Brain modeling; Vehicles; Videos; Support vector machines; Driver drowsiness detection; facial landmark detection; graph convolution networks (GCNs); TIME FATIGUE DETECTION; SYSTEM; ALERTNESS; STATE; EEG;
DOI
10.1109/TCYB.2021.3110813
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Convolutional neural networks (CNNs) have achieved remarkable performance in driver drowsiness detection based on the extraction of deep features of drivers' faces. However, the performance of driver drowsiness detection methods decreases sharply when complications, such as illumination changes in the cab, occlusions and shadows on the driver's face, and variations in the driver's head pose, occur. In addition, current driver drowsiness detection methods are not capable of distinguishing between driver states, such as talking versus yawning or blinking versus closing eyes. Therefore, technical challenges remain in driver drowsiness detection. In this article, we propose a novel and robust two-stream spatial-temporal graph convolutional network (2s-STGCN) for driver drowsiness detection to solve the above-mentioned challenges. To take advantage of the spatial and temporal features of the input data, we use a facial landmark detection method to extract the driver's facial landmarks from real-time videos and then obtain the driver drowsiness detection result by 2s-STGCN. Unlike existing methods, our proposed method uses videos rather than consecutive video frames as processing units. This is the first effort to exploit these processing units in the field of driver drowsiness detection. Moreover, the two-stream framework not only models both the spatial and temporal features but also models both the first-order and second-order information simultaneously, thereby notably improving driver drowsiness detection. Extensive experiments have been performed on the yawn detection dataset (YawDD) and the National TsingHua University drowsy driver detection (NTHU-DDD) dataset. The experimental results validate the feasibility of the proposed method. This method achieves an average accuracy of 93.4% on the YawDD dataset and an average accuracy of 92.7% on the evaluation set of the NTHU-DDD dataset.
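
To make the two-stream idea in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch: first-order landmark coordinates and second-order bone-like difference vectors are each passed through a small spatial-temporal graph-convolution stream, and the two softmax score vectors are fused. Only the general scheme (facial-landmark videos processed as two fused ST-GCN streams) comes from the abstract; the 68-landmark count, the chain connectivity, the placeholder adjacency matrix, and all layer sizes are assumptions made purely for illustration and are not the authors' implementation.

# Minimal sketch (not the authors' code) of a two-stream spatial-temporal
# graph-convolution pipeline over facial-landmark videos.
import torch
import torch.nn as nn
import torch.nn.functional as F


class STGCNBlock(nn.Module):
    """One spatial-temporal block: graph conv over landmarks, then temporal conv."""

    def __init__(self, in_ch, out_ch, A):
        super().__init__()
        self.register_buffer("A", A)                    # (V, V) landmark adjacency (assumed)
        self.spatial = nn.Conv2d(in_ch, out_ch, 1)      # 1x1 conv mixes channels per node
        self.temporal = nn.Conv2d(out_ch, out_ch, (9, 1), padding=(4, 0))
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):                               # x: (N, C, T, V)
        x = self.spatial(x)
        x = torch.einsum("nctv,vw->nctw", x, self.A)    # aggregate over graph neighbours
        x = self.temporal(x)
        return F.relu(self.bn(x))


class Stream(nn.Module):
    """A small stack of ST-GCN blocks, global pooling, and a drowsy/alert classifier."""

    def __init__(self, A, num_classes=2, in_ch=2):
        super().__init__()
        self.blocks = nn.Sequential(STGCNBlock(in_ch, 32, A), STGCNBlock(32, 64, A))
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):                               # x: (N, C, T, V)
        x = self.blocks(x)
        x = x.mean(dim=[2, 3])                          # pool over frames and landmarks
        return self.fc(x)


def bone_stream_input(joints, edges):
    """Second-order input: difference vectors between connected landmarks."""
    # joints: (N, 2, T, V); edges: list of (child, parent) landmark indices
    bones = torch.zeros_like(joints)
    for child, parent in edges:
        bones[..., child] = joints[..., child] - joints[..., parent]
    return bones


if __name__ == "__main__":
    V = 68                                              # assumed number of facial landmarks
    A = torch.eye(V)                                    # placeholder adjacency (self-loops only)
    edges = [(i, i - 1) for i in range(1, V)]           # hypothetical chain connectivity
    joint_net, bone_net = Stream(A), Stream(A)

    clip = torch.randn(1, 2, 30, V)                     # one 30-frame clip of (x, y) landmarks
    scores = F.softmax(joint_net(clip), dim=1) + \
             F.softmax(bone_net(bone_stream_input(clip, edges)), dim=1)
    print("fused drowsy/alert scores:", scores)

In the actual 2s-STGCN the facial graph, graph-partitioning strategy, and network depth differ; the sketch only illustrates how whole landmark videos can be handled as (channels, frames, landmarks) tensors in two score-fused streams.
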
Pages: 13821-13833 (13 pages)
Related Papers (50 records in total)
  • [21] Guo, Biao; Liu, Mingrui; He, Qian; Jiang, Ming. Two-Stream Spatial-Temporal Auto-Encoder With Adversarial Training for Video Anomaly Detection. IEEE ACCESS, 2024, 12: 125881-125889.
  • [22] Huang, Ru; Chen, Zijian; Zhai, Guangtao; He, Jianhua; Chu, Xiaoli. Spatial-temporal correlation graph convolutional networks for traffic forecasting. IET INTELLIGENT TRANSPORT SYSTEMS, 2023, 17(07): 1380-1394.
  • [23] Heglund, Jacob S. W.; Taleongpong, Panukorn; Hu, Simon; Tran, Huy T. Railway Delay Prediction with Spatial-Temporal Graph Convolutional Networks. 2020 IEEE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2020.
  • [24] Lee, Chien-Cheng; Gao, Wei-Wei; Lui, Ping-Wing. Rat Grooming Behavior Detection with Two-stream Convolutional Networks. 2019 NINTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2019.
  • [25] de Amorim, Cleison Correia; Macedo, David; Zanchettin, Cleber. Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: WORKSHOP AND SPECIAL SESSIONS, 2019, 11731: 646-657.
  • [26] Song, Chao; Lin, Youfang; Guo, Shengnan; Wan, Huaiyu. Spatial-Temporal Synchronous Graph Convolutional Networks: A New Framework for Spatial-Temporal Network Data Forecasting. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 914-921.
  • [27] Zhang, Wei; Lin, Zeyi; Cheng, Jian; Ma, Cuixia; Deng, Xiaoming; Wang, Hongan. STA-GCN: two-stream graph convolutional network with spatial-temporal attention for hand gesture recognition. THE VISUAL COMPUTER, 2020, 36: 2433-2444.
  • [28] Liu, Tianyu; Ma, Yujun; Yang, Wenhan; Ji, Wanting; Wang, Ruili; Jiang, Ping. Spatial-temporal interaction learning based two-stream network for action recognition. INFORMATION SCIENCES, 2022, 606: 864-876.
  • [29] Lee, James; Kang, Suk-ju. Skeleton action recognition using Two-Stream Adaptive Graph Convolutional Networks. 2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC), 2021.
  • [30] Li, Tong; Chen, Xinyue; Zhu, Fushun; Zhang, Zhengyu; Yan, Hua. Two-stream deep spatial-temporal auto-encoder for surveillance video abnormal event detection. NEUROCOMPUTING, 2021, 439: 256-270.