Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative Adversarial Network With Graph Representation Learning

Cited by: 5
Authors
Qi, Xingqun [1 ,2 ,3 ]
Sun, Muyi [4 ,5 ]
Wang, Zijian [6 ]
Liu, Jiaming [7 ]
Li, Qi [4 ]
Zhao, Fang [8 ]
Zhang, Shanghang [7 ]
Shan, Caifeng [8 ,9 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[2] Peking Univ, Sch Comp Sci, Beijing 100871, Peoples R China
[3] Hong Kong Univ Sci & Technol, Acad Interdisciplinary Studies, Hong Kong, Peoples R China
[4] Chinese Acad Sci, Inst Automat, NLPR, CRIPAC, Beijing 100190, Peoples R China
[5] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[6] Univ Sydney, Sch Comp Sci, Sydney, NSW 2008, Australia
[7] Peking Univ, Sch Comp Sci, Natl Key Lab Multimedia Informat Proc, Beijing 100871, Peoples R China
[8] Nanjing Univ, Sch Intelligence Sci & Technol, Nanjing 210023, Peoples R China
[9] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China
Keywords
Face photo-sketch synthesis; generative adversarial network; graph representation learning; intraclass and interclass; iterative cycle training (ICT)
DOI
10.1109/TNNLS.2023.3341246
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Biphasic face photo-sketch synthesis has significant practical value in wide-ranging fields such as digital entertainment and law enforcement. Previous approaches generate the photo or sketch directly in a global view; they often suffer from low sketch quality and complex photograph variations, leading to unnatural, low-fidelity results. In this article, we propose a novel semantic-driven generative adversarial network, coupled with graph representation learning, to address these issues. Considering that human faces have distinct spatial structures, we first inject class-wise semantic layouts into the generator to provide style-based spatial information for the synthesized face photographs and sketches. In addition, to enhance the authenticity of details in generated faces, we construct two types of representational graphs from semantic parsing maps of the input faces, dubbed the intraclass semantic graph (IASG) and the interclass structure graph (IRSG). Specifically, the IASG models the intraclass semantic correlations within each facial semantic component, thus producing realistic facial details. To keep the generated faces structurally coordinated, the IRSG models interclass structural relations among all facial components via graph representation learning. To further enhance the perceptual quality of the synthesized images, we present a biphasic interactive cycle training strategy that fully exploits the multilevel feature consistency between the photograph and the sketch. Extensive experiments demonstrate that our method outperforms state-of-the-art competitors on the CUHK Face Sketch (CUFS) and CUHK Face Sketch FERET (CUFSF) datasets.
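The graph construction described in the abstract can be illustrated with a minimal sketch: pool generator features over each semantic-parsing region to form one graph node per facial component, then run a simple normalized message-passing step over the component graph. This is a hedged toy illustration, not the paper's implementation; the function names, the fully connected adjacency, and the single GCN-style propagation step are assumptions for demonstration.

```python
import numpy as np

def class_node_features(features, parsing, num_classes):
    """Build one graph node per semantic class via mask-average pooling.

    features: (C, H, W) feature map; parsing: (H, W) integer label map.
    Returns (num_classes, C) node feature matrix.
    """
    C, H, W = features.shape
    nodes = np.zeros((num_classes, C))
    for k in range(num_classes):
        mask = (parsing == k)
        if mask.any():
            # Average all feature vectors falling inside component k.
            nodes[k] = features[:, mask].mean(axis=1)
    return nodes

def propagate(nodes, adj, weight):
    """One GCN-style step: row-normalize adjacency, linear map, ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0          # avoid division by zero for isolated nodes
    out = (adj / deg) @ nodes @ weight
    return np.maximum(out, 0.0)

# Toy usage: 3 facial components, 8-dim features on a 4x4 grid.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
parsing = rng.integers(0, 3, size=(4, 4))
nodes = class_node_features(feats, parsing, 3)
adj = np.ones((3, 3))            # assumed fully connected component graph
updated = propagate(nodes, adj, rng.standard_normal((8, 8)))
```

In the paper's setting the interclass graph would relate whole components to one another (as the IRSG does), while intraclass relations are modeled within each component region; the pooling-plus-propagation pattern above is the common skeleton of both.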
Pages: 1-14 (14 pages)