HIGSA: Human image generation with self-attention

Cited by: 4
Authors
Wu, Haoran [1 ]
He, Fazhi [1 ]
Si, Tongzhen [1 ]
Duan, Yansong [2 ]
Yan, Xiaohu [3 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[3] Shenzhen Polytech, Sch Artificial Intelligence, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; GAN; Human image generation; Attention; Neural network; Recognition
DOI
10.1016/j.aei.2022.101856
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
The goal of human image generation (HIG) is to synthesize a human image in a novel pose. HIG can potentially benefit various computer vision applications and engineering tasks. Recently developed CNN-based approaches apply attention architectures to vision tasks. However, owing to the locality of CNNs, it is difficult to extract and maintain long-range pixel interactions in the input images, so existing human image generation methods suffer from limited content representation. In this paper, we propose a novel human image generation framework called HIGSA that exploits the position information of the input source image. HIGSA contains two complementary self-attention blocks that together generate photo-realistic human images: a stripe self-attention block (SSAB) and a content attention block (CAB). The SSAB establishes global dependencies across the human image and computes an attention map for each pixel based on its spatial position relative to other pixels. The CAB introduces an effective feature extraction module that interactively enhances both the person's appearance and shape feature representations. As a result, the HIGSA framework preserves appearance and shape consistency with sharper details. Extensive experiments on mainstream datasets demonstrate that HIGSA achieves state-of-the-art (SOTA) results.
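The abstract does not specify how the stripe self-attention block is implemented. The following is only a minimal sketch, assuming an axial (row- and column-wise) formulation in which each pixel attends to the other pixels in its horizontal and vertical stripes; the class name StripeSelfAttention, the reduction parameter, and the residual weighting are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of a stripe-style self-attention block (PyTorch).
# Not HIGSA's published implementation; names and defaults are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StripeSelfAttention(nn.Module):
    """Attend along horizontal and vertical stripes so every pixel sees
    long-range context in its row and column at O(H*W*(H+W)) cost."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        inner = max(channels // reduction, 1)
        self.query = nn.Conv2d(channels, inner, kernel_size=1)
        self.key = nn.Conv2d(channels, inner, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    @staticmethod
    def _attend(q, k, v):
        # q, k: (B*, L, C'); v: (B*, L, C); softmax over the stripe length L
        attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)

        # Horizontal stripes: each row becomes an independent sequence of length W.
        qh = q.permute(0, 2, 3, 1).reshape(b * h, w, -1)
        kh = k.permute(0, 2, 3, 1).reshape(b * h, w, -1)
        vh = v.permute(0, 2, 3, 1).reshape(b * h, w, c)
        row = self._attend(qh, kh, vh).reshape(b, h, w, c).permute(0, 3, 1, 2)

        # Vertical stripes: each column becomes a sequence of length H.
        qv = q.permute(0, 3, 2, 1).reshape(b * w, h, -1)
        kv = k.permute(0, 3, 2, 1).reshape(b * w, h, -1)
        vv = v.permute(0, 3, 2, 1).reshape(b * w, h, c)
        col = self._attend(qv, kv, vv).reshape(b, w, h, c).permute(0, 3, 2, 1)

        return x + self.gamma * (row + col)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 16)      # (batch, channels, H, W)
    out = StripeSelfAttention(64)(feat)
    print(out.shape)                       # torch.Size([2, 64, 32, 16])
```

Under these assumptions, the block keeps the quadratic attention cost confined to individual rows and columns rather than the full H*W pixel grid, which is one common way to make global dependencies affordable on image feature maps.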
Pages: 8
Related papers
50 records total
  • [1] Research of Self-Attention in Image Segmentation
    Cao, Fude
    Zheng, Chunguang
    Huang, Limin
    Wang, Aihua
    Zhang, Jiong
    Zhou, Feng
    Ju, Haoxue
    Guo, Haitao
    Du, Yuxia
    JOURNAL OF INFORMATION TECHNOLOGY RESEARCH, 2022, 15 (01)
  • [2] Improve Image Captioning by Self-attention
    Li, Zhenru
    Li, Yaoyi
    Lu, Hongtao
    NEURAL INFORMATION PROCESSING, ICONIP 2019, PT V, 2019, 1143 : 91 - 98
  • [3] Self-Attention Technology in Image Segmentation
    Cao, Fude
    Lu, Xueyun
    INTERNATIONAL CONFERENCE ON INTELLIGENT TRAFFIC SYSTEMS AND SMART CITY (ITSSC 2021), 2022, 12165
  • [4] Variational joint self-attention for image captioning
    Shao, Xiangjun
    Xiang, Zhenglong
    Li, Yuanxiang
    Zhang, Mingjie
    IET IMAGE PROCESSING, 2022, 16 (08) : 2075 - 2086
  • [5] Sparse self-attention transformer for image inpainting
    Huang, Wenli
    Deng, Ye
    Hui, Siqi
    Wu, Yang
    Zhou, Sanping
    Wang, Jinjun
    PATTERN RECOGNITION, 2024, 145
  • [6] Keyphrase Generation Based on Self-Attention Mechanism
    Yang, Kehua
    Wang, Yaodong
    Zhang, Wei
    Yao, Jiqing
    Le, Yuquan
CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 61 (02): 569 - 581
  • [7] Self-Attention Mechanism in GANs for Molecule Generation
    Chinnareddy, Sandeep
    Grandhi, Pranav
    Narayan, Apurva
20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021: 57 - 60
  • [8] Relation constraint self-attention for image captioning
    Ji, Junzhong
    Wang, Mingzhan
    Zhang, Xiaodan
    Lei, Minglong
    Qu, Liangqiong
    NEUROCOMPUTING, 2022, 501 : 778 - 789
  • [9] LayoutTransformer: Layout Generation and Completion with Self-attention
    Gupta, Kamal
    Lazarow, Justin
    Achille, Alessandro
    Davis, Larry
    Mahadevan, Vijay
    Shrivastava, Abhinav
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 984 - 994
  • [10] SA-SinGAN: self-attention for single-image generation adversarial networks
    Chen, Xi
    Zhao, Hongdong
    Yang, Dongxu
    Li, Yueyuan
    Kang, Qing
    Lu, Haiyan
    MACHINE VISION AND APPLICATIONS, 2021, 32 (04)