Research on Heart Rate Detection from Facial Videos Based on an Attention Mechanism 3D Convolutional Neural Network

Cited by: 0

Authors:

Sun, Xiujuan [1]

Su, Ying [1]

Hou, Xiankai [1]

Yuan, Xiaolan [1]

Li, Hongxue [1]

Wang, Chuanjiang [1]

Affiliations:

[1] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China

Source:

ELECTRONICS | 2025 / Vol. 14 / Issue 02

Keywords:

BiLSTM; attention mechanism; convolutional neural network; facial video; rPPG;

DOI:

10.3390/electronics14020269

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Remote photoplethysmography (rPPG) has attracted growing attention due to its non-contact nature. However, existing non-contact heart rate detection methods are often affected by noise from motion artifacts and changes in lighting, which can reduce detection accuracy. To solve this problem, this paper first uses manual extraction to define the facial Region of Interest (ROI), expanding the facial area while avoiding rigid regions such as the eyes and mouth to minimize the impact of motion artifacts. Additionally, during the training phase, illumination normalization is applied to video frames with uneven lighting to mitigate noise caused by lighting fluctuations. Finally, this paper introduces a 3D convolutional neural network (CNN) method incorporating an attention mechanism for heart rate detection from facial videos. We optimize the traditional 3D-CNN to capture global features in spatiotemporal data more effectively. The SimAM attention mechanism is introduced to enable the model to focus on and enhance facial ROI feature representations. Following the extraction of rPPG signals, a heart rate estimation network using a bidirectional long short-term memory (BiLSTM) model is employed to derive the heart rate from the signals. The method is experimentally validated on two publicly available datasets, UBFC-rPPG and PURE. On these datasets, the mean absolute errors were 0.24 bpm and 0.65 bpm and the root mean square errors were 0.63 bpm and 1.30 bpm, respectively, and the Pearson correlation coefficients reached 0.99, confirming the method's reliability. Comparisons of predicted signals with ground truth signals further validated its accuracy.
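
Note: the abstract reports three evaluation metrics: mean absolute error (MAE), root mean square error (RMSE), and the Pearson correlation coefficient. The sketch below shows how these are conventionally computed from paired heart rate values. It is only an illustration of the standard definitions, not the authors' evaluation code; the function names and the sample values are hypothetical and are not data from the paper.

```python
import math


def mae(pred, truth):
    """Mean absolute error between predicted and reference heart rates (bpm)."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


def rmse(pred, truth):
    """Root mean square error between predicted and reference heart rates (bpm)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def pearson(pred, truth):
    """Pearson correlation coefficient between the two sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in truth))
    return cov / (std_p * std_t)


# Hypothetical heart rates in bpm, for illustration only (not from the paper).
predicted = [72.0, 80.5, 65.0, 90.0]
reference = [71.0, 81.0, 66.0, 89.0]

print("MAE  (bpm):", mae(predicted, reference))
print("RMSE (bpm):", rmse(predicted, reference))
print("Pearson r :", pearson(predicted, reference))
```

MAE and RMSE are both expressed in bpm, and RMSE weights large errors more heavily, which is why the reported RMSE values exceed the corresponding MAE values.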


Pages: 17

References: 35 in total

