Boosting Facial Emotion Recognition by Using GANs to Augment Small Facial Expression Dataset

Cited by: 1
Authors
Hung, Shih-Kai [1 ]
Gan, John Q. [1 ]
Affiliations
[1] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, Essex, England
Source
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2022
Keywords
facial expression recognition; generative adversarial network (GAN); deep learning; image data augmentation;
DOI
10.1109/IJCNN55064.2022.9892096
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Facial expression recognition (FER) is an important research field for human-machine interfaces, and FER by deep learning from small training datasets remains challenging. This paper proposes a new generative adversarial network (GAN) model that transfers neutral face images into images with diverse facial expressions, addressing the scarcity of expressive data in deep learning for FER based on a small set of training samples. To mitigate the distortions and overfitting that often occur when training GANs on small datasets, a novel GAN architecture is proposed, consisting of a generator with two encoders and two decoders, two discriminators, and a feature extractor. Specifically, a feature map mechanism is proposed to discover regional feature differences between images in the source and target domains, which enables the proposed GAN architecture not only to generate desirable facial expression images but also to preserve the original characteristics of the input neutral face images. Experimental results show that, by using the proposed GAN to augment a training dataset of images covering up to seven facial expressions, the FER accuracy of several deep neural networks tested in the experiments can be significantly improved, by over 10%.
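The abstract describes the architecture only at a high level (a generator with two encoders and two decoders, two discriminators, and a feature extractor). The minimal PyTorch sketch below illustrates one plausible reading of that description; all class names, layer sizes, and the latent-difference wiring are illustrative assumptions rather than the authors' actual design, and the feature extractor (presumably a pretrained perceptual network) and the training losses are omitted for brevity.

# Hypothetical sketch of the two-encoder/two-decoder GAN described in the abstract.
# Layer sizes, wiring, and the feature-difference mechanism are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, down=True):
    # Shared (de)convolutional building block for encoders, decoders and discriminators.
    conv = (nn.Conv2d(in_ch, out_ch, 4, 2, 1) if down
            else nn.ConvTranspose2d(in_ch, out_ch, 4, 2, 1))
    return nn.Sequential(conv, nn.InstanceNorm2d(out_ch), nn.ReLU(inplace=True))

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(3, 64), conv_block(64, 128), conv_block(128, 256))

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(256, 128, down=False),
                                 conv_block(128, 64, down=False),
                                 nn.ConvTranspose2d(64, 3, 4, 2, 1),
                                 nn.Tanh())

    def forward(self, z):
        return self.net(z)

class TwoStreamGenerator(nn.Module):
    # One encoder/decoder pair per domain (neutral vs. expressive). The "feature
    # map mechanism" from the abstract is modelled here, as an assumption, by a
    # simple element-wise difference between the two latent feature maps.
    def __init__(self):
        super().__init__()
        self.enc_neutral, self.enc_expr = Encoder(), Encoder()
        self.dec_neutral, self.dec_expr = Decoder(), Decoder()

    def forward(self, neutral_img, expr_img):
        z_n = self.enc_neutral(neutral_img)
        z_e = self.enc_expr(expr_img)
        diff = z_e - z_n                        # regional feature differences
        fake_expr = self.dec_expr(z_n + diff)   # neutral face -> target expression
        recon_neutral = self.dec_neutral(z_n)   # identity-preserving reconstruction
        return fake_expr, recon_neutral

class PatchDiscriminator(nn.Module):
    # PatchGAN-style discriminator; two independent instances would be trained,
    # one for each image domain.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv_block(3, 64), conv_block(64, 128),
                                 nn.Conv2d(128, 1, 4, 1, 1))

    def forward(self, x):
        return self.net(x)

if __name__ == "__main__":
    G = TwoStreamGenerator()
    D_expr = PatchDiscriminator()
    neutral = torch.randn(2, 3, 128, 128)        # batch of neutral faces
    expressive = torch.randn(2, 3, 128, 128)     # batch of expression exemplars
    fake_expr, recon = G(neutral, expressive)
    print(fake_expr.shape, recon.shape, D_expr(fake_expr).shape)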
Pages: 8