Micro-expression spotting with multi-scale local transformer in long videos

Cited by: 14
Authors
Guo, Xupeng [1]
Zhang, Xiaobiao [1]
Li, Lei [2]
Xia, Zhaoqiang [1, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710129, Peoples R China
[2] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Peoples R China
[3] Northwestern Polytech Univ, Innovat Ctr NPU Chongqing, Chongqing 400000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Micro-expression spotting; Convolutional network; Local transformer
DOI
10.1016/j.patrec.2023.03.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Micro-expression analysis with computer vision techniques has attracted much attention because it can reveal human emotions automatically. Among the analysis tasks, temporal spotting, which locates expression-aware frames in long video sequences, is the most challenging. Compared with the well-studied recognition task, the spotting task needs more research to further improve performance and benefit subsequent tasks. In this paper, we therefore propose a convolutional-transformer-based deep model for micro-expression spotting in long video sequences. A 3D convolutional subnetwork is first employed to extract visual features from the frames in a fixed-size sliding window over the original video sequence. A multi-scale local transformer module is then designed on top of the visual features to model the correlation between frames in a local window. By leveraging this correlation information, the description of face movement becomes more representative for micro-expressions of various durations. Finally, the multi-head classifier and the corresponding estimator are combined to predict the temporal position for spotting micro-expressions. The proposed method is evaluated on two publicly available datasets, CAS(ME)² and SAMM-LV, and achieves F1-scores of 0.2770 on SAMM-LV and 0.1373 on CAS(ME)². The code is publicly available on GitHub (https://github.com/xiazhaoqiang/MULT-MicroExpressionSpot). (c) 2023 Elsevier B.V. All rights reserved.
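As a rough illustration of two ideas named in the abstract, the Python sketch below splits a long video into fixed-size sliding windows and builds local attention masks at several scales, where each frame may only attend to neighbours within a given radius. It is not the authors' implementation: the function names, the window size, the stride, and the radii are hypothetical choices made for this example, and the feature extraction, transformer layers, and classifier heads are omitted.

```python
def sliding_windows(num_frames, window, stride):
    """Return (start, end) frame index pairs (end exclusive) covering the video.

    Assumption for this sketch: a final window is appended, aligned to the
    end of the video, when the regular stride leaves trailing frames uncovered.
    """
    if num_frames <= window:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


def local_attention_mask(length, radius):
    """mask[i][j] is True when frame j lies within `radius` frames of frame i."""
    return [[abs(i - j) <= radius for j in range(length)] for i in range(length)]


def multi_scale_masks(length, radii):
    """Build one local attention mask per radius (one per hypothetical scale)."""
    return {r: local_attention_mask(length, r) for r in radii}


if __name__ == "__main__":
    # Hypothetical numbers: an 11-frame clip, windows of 4 frames, stride 3.
    for start, end in sliding_windows(11, 4, 3):
        print(start, end)
    masks = multi_scale_masks(4, [1, 2])
    print(masks[1][0])
```

A small radius restricts attention to near neighbours, which suits short movements, while a larger radius lets a frame see more temporal context; combining several radii is one plausible reading of "multi-scale" here, stated as an assumption rather than as the paper's design.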
Pages: 146-152
Number of pages: 7