Research on efficient feature extraction: Improving YOLOv5 backbone for facial expression detection in live streaming scenes

Cited by: 10

Authors:

Li, Zongwei [1]

Song, Jia [1]

Qiao, Kai [1]

Li, Chenghai [2]

Zhang, Yanhui [3]

Li, Zhenyu [1]

Affiliations:

[1] Shanghai Inst Technol, Sch Econ & Management, Shanghai, Peoples R China

[2] Anhui Univ Technol, Sch Management Sci & Engn, Maanshan, Peoples R China

[3] East China Univ Sci & Technol, Business Sch, Shanghai, Peoples R China

Source:

FRONTIERS IN COMPUTATIONAL NEUROSCIENCE | 2022 / Volume 16

Funding:

National Natural Science Foundation of China;

Keywords:

model optimization; object detection; attention mechanism; cascade classifier; live streaming;

DOI:

10.3389/fncom.2022.980063

Chinese Library Classification:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Facial expressions, whether simple or complex, convey pheromones that can affect others. Plentiful sensory input delivered by marketing anchors' facial expressions to audiences can stimulate consumers' identification and influence decision-making, especially in live streaming media marketing. This paper proposes an efficient feature extraction network based on the YOLOv5 model for detecting anchors' facial expressions. First, a two-step cascade classifier and recycler is established to filter invalid video frames to generate a facial expression dataset of anchors. Second, GhostNet and coordinate attention are fused in YOLOv5 to reduce latency and improve accuracy. YOLOv5 modified with the proposed efficient feature extraction structure outperforms the original YOLOv5 on our self-built dataset in both speed and accuracy.
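
The abstract does not give layer configurations, so the following is only a rough sketch of why a GhostNet-style block can lower cost relative to an ordinary convolution. It counts weights (bias terms ignored) under the general Ghost module idea: a primary convolution produces a fraction of the output maps, and cheap depthwise operations generate the rest. The function names, the channel sizes, and the `ratio` and `dw_k` values are illustrative assumptions, not figures from the paper.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (no bias)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k=3, ratio=2, dw_k=3):
    """Approximate weight count of a Ghost-style module (no bias).

    Assumptions (hypothetical values, not from the paper):
    - the primary convolution produces ceil(c_out / ratio) intrinsic maps;
    - each intrinsic map yields (ratio - 1) extra maps through a
      dw_k x dw_k depthwise "cheap" operation.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


if __name__ == "__main__":
    # Illustrative layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    standard = conv_params(128, 256, 3)
    ghost = ghost_params(128, 256, k=3, ratio=2, dw_k=3)
    print(f"standard conv weights: {standard}")
    print(f"ghost module weights:  {ghost}")
    print(f"reduction factor:      {standard / ghost:.2f}")
```

With `ratio=2` the weight count falls by a factor close to two for this illustrative layer. Actual latency gains depend on hardware and on the rest of the network, which this count does not model.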


Pages: 14
