Multi-Label Conditional Generation From Pre-Trained Models

Cited by: 0
Authors
Proszewska, Magdalena [1 ]
Wolczyk, Maciej [1 ]
Zieba, Maciej [2 ,3 ]
Wielopolski, Patryk [4 ]
Maziarka, Lukasz [1 ]
Smieja, Marek [1 ]
Affiliations
[1] Jagiellonian Univ, Fac Math & Comp Sci, PL-31007 Krakow, Poland
[2] Tooploox, PL-53601 Wroclaw, Poland
[3] Wroclaw Univ Sci & Technol, PL-53601 Wroclaw, Poland
[4] Wroclaw Univ Sci & Technol, PL-50370 Wroclaw, Poland
Keywords
Training; Computational modeling; Adaptation models; Vectors; Data models; Aerospace electronics; Three-dimensional displays; Conditional generation; deep generative models; GANs; invertible normalizing flows; pre-trained models; VAEs;
DOI
10.1109/TPAMI.2024.3382008
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although modern generative models achieve excellent quality in a variety of tasks, they often lack the essential ability to generate examples with requested properties, such as the age of the person in a photo or the weight of a generated molecule. To overcome these limitations, we propose PluGeN (Plugin Generative Network), a simple yet effective generative technique that can be used as a plugin for pre-trained generative models. The idea behind our approach is to use a flow-based module to transform the entangled latent representation into a multi-dimensional space where the values of each attribute are modeled as an independent one-dimensional distribution. As a consequence, PluGeN can generate new samples with desired attributes as well as manipulate labeled attributes of existing examples. Thanks to the disentangled latent representation, we can even generate samples with combinations of attributes that are rare or absent from the dataset, such as a young person with gray hair, a man with makeup, or a woman with a beard. In contrast to competing approaches, PluGeN can be trained on partially labeled data. We combined PluGeN with GAN and VAE models and applied it to conditional generation and manipulation of images, chemical molecule modeling, and 3D point cloud generation.
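The mechanism the abstract describes can be illustrated with a toy sketch: an invertible map carries the entangled latent z of a frozen pre-trained model into a space w where one coordinate per attribute follows its own one-dimensional distribution, so conditioning and manipulation reduce to setting coordinates in w and inverting the map. The sketch below uses a plain invertible linear map as a stand-in for PluGeN's flow-based module, and the attribute distribution and dimensionality are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the flow-based module: an invertible linear map
# between the entangled latent z and a disentangled space w.
# (PluGeN uses an invertible normalizing flow; a linear bijection is
# the simplest invertible map and only illustrates the interface.)
D = 4  # assumed latent dimensionality for the sketch
A = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # invertible (near-identity)
b = rng.standard_normal(D)

def to_disentangled(z):
    """Forward direction of the toy 'flow': w = A z + b."""
    return A @ z + b

def to_latent(w):
    """Inverse direction: recover z from w exactly."""
    return np.linalg.solve(A, w - b)

# In w-space, suppose coordinate 0 models one labeled attribute
# (e.g. "smiling") as a narrow 1-D mode, and the rest are nuisance dims.
def sample_w(attribute_value):
    attr = attribute_value + 0.05 * rng.standard_normal()
    nuisance = rng.standard_normal(D - 1)
    return np.concatenate([[attr], nuisance])

# Conditional generation: fix the attribute, sample nuisance dims,
# invert the map; z would then be decoded by the frozen pre-trained model.
w = sample_w(attribute_value=1.0)
z = to_latent(w)
assert np.allclose(to_disentangled(z), w)  # the map is invertible

# Attribute manipulation of an existing example: edit one coordinate
# in w-space and invert, leaving all other coordinates untouched.
z_old = rng.standard_normal(D)
w_old = to_disentangled(z_old)
w_new = w_old.copy()
w_new[0] = -1.0            # move the attribute to its other mode
z_new = to_latent(w_new)   # edited latent for the pre-trained decoder
```

Because each attribute occupies its own independent coordinate, nothing stops us from setting combinations of coordinates that never co-occur in the training data, which is how rare or unseen attribute combinations become reachable.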
Pages: 6185-6198 (14 pages)