Learning invariant and uniformly distributed feature space for multi-view generation

Cited by: 1
Authors
Lu, Yuqin [1 ,3 ]
Cao, Jiangzhong [1 ,5 ]
He, Shengfeng [2 ,3 ]
Guo, Jiangtao [1 ]
Zhou, Qiliang [1 ]
Dai, Qingyun [4 ,5 ]
Affiliations
[1] Guangdong Univ Technol, Sch Informat Engn, Guangzhou 510006, Peoples R China
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore 178902, Singapore
[3] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[4] Guangdong Polytech Normal Univ, Sch Elect & Informat, Guangzhou 510665, Peoples R China
[5] Key Lab Intellectual Property & Big Data, Guangzhou 510665, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view generation; Generative adversarial networks; Contrastive learning; Fusion
DOI
10.1016/j.inffus.2023.01.011
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-view generation from a given single view is a significant yet challenging problem with broad applications in virtual reality and robotics. Existing methods mainly rely on a basic GAN-based structure to directly learn a mapping between two different views. Although they can produce plausible results, they still struggle to recover faithful details and fail to generalize to unseen data. In this paper, we propose to learn invariant and uniformly distributed representations for multi-view generation with an "Alignment" and a "Uniformity" constraint (AU-GAN). Our method draws on the idea of contrastive learning to learn a well-regulated feature space for multi-view generation. Specifically, our feature extractor is designed to extract view-invariant representations that capture the intrinsic and essential knowledge of the input, and to distribute all representations evenly throughout the space so that the network can "explore" the entire feature space, thus avoiding poor generative ability on unseen data. Extensive experiments on multi-view generation for both faces and objects demonstrate the capability of our method to generate realistic, high-quality views, especially for unseen data in wild conditions.
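The two constraints described above correspond closely to the alignment and uniformity objectives from contrastive representation learning (Wang & Isola, 2020). The sketch below is an illustrative PyTorch implementation of that standard formulation, not the authors' released code; the function names, loss weighting, and hyperparameters (alpha, t) are assumptions for exposition only.

```python
import torch
import torch.nn.functional as F

def alignment_loss(z_a: torch.Tensor, z_b: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Pull features of the same instance under different views together.

    z_a, z_b: (N, D) L2-normalized embeddings of paired views of the same inputs.
    """
    return (z_a - z_b).norm(p=2, dim=1).pow(alpha).mean()

def uniformity_loss(z: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Encourage embeddings to spread evenly over the unit hypersphere,
    via the log of the mean pairwise Gaussian potential."""
    sq_dists = torch.pdist(z, p=2).pow(2)  # condensed pairwise squared distances
    return sq_dists.mul(-t).exp().mean().log()

# Toy usage: 128 pairs of 64-dim features from a hypothetical extractor.
z_a = F.normalize(torch.randn(128, 64), dim=1)
z_b = F.normalize(torch.randn(128, 64), dim=1)
loss = alignment_loss(z_a, z_b) + uniformity_loss(torch.cat([z_a, z_b], dim=0))
```

Minimizing the alignment term pulls paired views of the same instance together (view invariance), while the uniformity term penalizes clustered embeddings on the unit hypersphere, which is one way to realize the "explore the entire feature space" behavior the abstract describes.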
Pages: 383–395
Page count: 13