Face photo-sketch synthesis via full-scale identity supervision

Cited by: 16
Authors
Cao, Bing [1]
Wang, Nannan [2]
Li, Jie [3]
Hu, Qinghua [1]
Gao, Xinbo [4]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[2] Xidian Univ, Sch Telecommun Engn, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
[3] Xidian Univ, Sch Elect Engn, Video & Image Proc Syst Lab, Xian 710071, Peoples R China
[4] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Face photo-sketch synthesis; Identity supervision; Cross-domain translation; Intra-domain adaptation;
DOI
10.1016/j.patcog.2021.108446
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Face photo-sketch synthesis refers to transforming a face image between the photo domain and the sketch domain. It plays a crucial role in law enforcement and digital entertainment. A great deal of effort has been devoted to face photo-sketch synthesis. However, limited by weak identity supervision, existing methods mostly yield indistinct details or large deformations, resulting in poor perceptual appearance or low recognition accuracy. In the past several years, face identification has made great progress, and face images can now be represented much more precisely than before. Considering that face image translation is also a type of face image re-representation, we attempt to introduce face recognition models to improve synthesis performance. First, we apply existing synthesis models to augment the training set. Then, we propose a full-scale identity supervision method to reduce the redundant information introduced by these pseudo samples and to exploit the valid information to enhance intra-class variations. The proposed framework consists of two sub-networks: a cross-domain translation (CT) network and an intra-domain adaptation (IA) network. The CT network translates the input image from the source domain to a latent image in the target domain, which bridges the large gap between the two domains with less structural deformation. The IA network adapts the perceptual appearance of the latent image to that of the target image by adversarial learning. Experimental results on the CUHK Face Sketch Database and the CUHK Face Sketch FERET Database demonstrate that the proposed method preserves the best perceptual appearance and more distinct details with less deformation. (c) 2021 Elsevier Ltd. All rights reserved.
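The two-stage data flow described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the names synthesize, avg_pool, embed, and multi_scale_identity_loss are hypothetical, the flattening embed function is only a stand-in for a real face recognition model, and reading "full-scale" as "an identity term averaged over several image scales" is an assumption made purely for illustration. The adversarial training of the IA network is not modeled here.

```python
import math


def avg_pool(img, k):
    """Downsample a 2-D list of floats by averaging non-overlapping k x k blocks."""
    h, w = len(img) // k, len(img[0]) // k
    return [
        [
            sum(img[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
            / (k * k)
            for j in range(w)
        ]
        for i in range(h)
    ]


def embed(img):
    """Hypothetical stand-in for a face recognition model: flatten the pixels."""
    return [v for row in img for v in row]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; a small epsilon guards against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


def multi_scale_identity_loss(output, target, scales=(1, 2, 4)):
    """Assumed reading of identity supervision: mean embedding distance over scales."""
    distances = [
        cosine_distance(embed(avg_pool(output, k)), embed(avg_pool(target, k)))
        for k in scales
    ]
    return sum(distances) / len(distances)


def synthesize(image, ct, ia):
    """Two-stage flow from the abstract: CT yields a latent image, IA refines it."""
    latent = ct(image)   # cross-domain translation: source -> latent target-domain image
    return ia(latent)    # intra-domain adaptation: latent -> final appearance


# Toy stand-ins for the two sub-networks (assumptions, not learned models).
def toy_ct(img):
    return [[1.0 - v for v in row] for row in img]


def toy_ia(img):
    return [[min(1.0, max(0.0, v)) for v in row] for row in img]


target = [[(i * 4 + j + 1) / 16 for j in range(4)] for i in range(4)]
mirrored = [list(reversed(row)) for row in target]
same_loss = multi_scale_identity_loss(target, target)
diff_loss = multi_scale_identity_loss(mirrored, target)
```

In this toy setting the loss is close to zero when the output matches the target and grows when the image is mirrored; in a real system the embeddings would come from a pretrained face recognition network rather than raw pixels.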
Pages: 11