Multi-Label Conditional Generation From Pre-Trained Models

Cited by: 0
Authors
Proszewska, Magdalena [1 ]
Wolczyk, Maciej [1 ]
Zieba, Maciej [2 ,3 ]
Wielopolski, Patryk [4 ]
Maziarka, Lukasz [1 ]
Smieja, Marek [1 ]
Affiliations
[1] Jagiellonian Univ, Fac Math & Comp Sci, PL-31007 Krakow, Poland
[2] Tooploox, PL-53601 Wroclaw, Poland
[3] Wroclaw Univ Sci & Technol, PL-53601 Wroclaw, Poland
[4] Wroclaw Univ Sci & Technol, PL-50370 Wroclaw, Poland
Keywords
Training; Computational modeling; Adaptation models; Vectors; Data models; Aerospace electronics; Three-dimensional displays; Conditional generation; deep generative models; GANs; invertible normalizing flows; pre-trained models; VAEs;
DOI
10.1109/TPAMI.2024.3382008
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although modern generative models achieve excellent quality in a variety of tasks, they often lack the essential ability to generate examples with requested properties, such as the age of the person in a photo or the weight of a generated molecule. To overcome these limitations, we propose PluGeN (Plugin Generative Network), a simple yet effective generative technique that can be used as a plugin for pre-trained generative models. The idea behind our approach is to use a flow-based module to transform the entangled latent representation into a multi-dimensional space where the values of each attribute are modeled as an independent one-dimensional distribution. As a consequence, PluGeN can generate new samples with desired attributes as well as manipulate labeled attributes of existing examples. Thanks to the disentangled latent representation, we can even generate samples with combinations of attributes that are rare or absent from the dataset, such as a young person with gray hair, a man with makeup, or a woman with a beard. In contrast to competing approaches, PluGeN can be trained on partially labeled data. We combined PluGeN with GAN and VAE models and applied it to conditional generation and manipulation of images, chemical molecule modeling, and 3D point cloud generation.
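The mechanism the abstract describes can be illustrated with a toy sketch: an invertible map carries the entangled latent z of a frozen pre-trained model into a space w where one coordinate per attribute follows its own one-dimensional distribution, so conditioning and manipulation reduce to setting coordinates in w and inverting the map. The sketch below uses a plain invertible linear map as a stand-in for PluGeN's flow-based module, and the attribute distribution and dimensionality are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the flow-based module: an invertible linear map
# between the entangled latent z and a disentangled space w.
# (PluGeN uses an invertible normalizing flow; a linear bijection is
# the simplest invertible map and only illustrates the interface.)
D = 4  # assumed latent dimensionality for the sketch
A = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # invertible (near-identity)
b = rng.standard_normal(D)

def to_disentangled(z):
    """Forward direction of the toy 'flow': w = A z + b."""
    return A @ z + b

def to_latent(w):
    """Inverse direction: recover z from w exactly."""
    return np.linalg.solve(A, w - b)

# In w-space, suppose coordinate 0 models one labeled attribute
# (e.g. "smiling") as a narrow 1-D mode, and the rest are nuisance dims.
def sample_w(attribute_value):
    attr = attribute_value + 0.05 * rng.standard_normal()
    nuisance = rng.standard_normal(D - 1)
    return np.concatenate([[attr], nuisance])

# Conditional generation: fix the attribute, sample nuisance dims,
# invert the map; z would then be decoded by the frozen pre-trained model.
w = sample_w(attribute_value=1.0)
z = to_latent(w)
assert np.allclose(to_disentangled(z), w)  # the map is invertible

# Attribute manipulation of an existing example: edit one coordinate
# in w-space and invert, leaving all other coordinates untouched.
z_old = rng.standard_normal(D)
w_old = to_disentangled(z_old)
w_new = w_old.copy()
w_new[0] = -1.0            # move the attribute to its other mode
z_new = to_latent(w_new)   # edited latent for the pre-trained decoder
```

Because each attribute occupies its own independent coordinate, nothing stops us from setting combinations of coordinates that never co-occur in the training data, which is how rare or unseen attribute combinations become reachable.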
Pages: 6185-6198 (14 pages)