Text-Guided Portrait Image Matting

Cited by: 0
Authors
Xu Y. [1]
Yao X. [1]
Liu B. [1]
Quan Y. [1]
Ji H. [2]
Affiliations
[1] School of Computer Science and Engineering, South China University of Technology, Guangzhou
[2] Department of Mathematics, National University of Singapore
Keywords
Annotations; Artificial intelligence; Artificial neural networks; Attention; Batch production systems; Cross-modal Learning; Data mining; Feature extraction; Image Matting; Text Guidance; Training
DOI
10.1109/TAI.2024.3363120
Abstract
Image matting is a technique used to separate the foreground of an image from the background, which estimates an alpha matte that indicates pixel-wise degree of transparency. To precisely extract target objects and address the ambiguity of solutions in image matting, many existing approaches employ a trimap or background image provided by the user as additional input to guide the matting process. This paper introduces a novel matting paradigm termed text-guided image matting, utilizing a textual description of the foreground object as a guiding element. In contrast to trimap or background-based methods, text-guided matting offers a user-friendly interface, providing semantic clues for the objects of interest. Moreover, it facilitates batch processing across multiple frames featuring the same objects of interest. The proposed text-guided matting approach is implemented through a deep neural network comprising three-stage cross-modal feature fusion and two-step alpha matte prediction. Experimental results on portrait matting demonstrate the competitive performance of our text-guided approach compared to existing trimap-based and background-based methods.
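As a rough illustration of what an alpha matte encodes, the sketch below applies the standard compositing model I = alpha * F + (1 - alpha) * B to a tiny grayscale example. The abstract does not state this equation, and the code is not the paper's network: the function name `composite` and the pixel values are hypothetical, chosen only to show how a per-pixel transparency value blends foreground and background.

```python
def composite(alpha, foreground, background):
    """Blend each pixel with the standard model I = alpha * F + (1 - alpha) * B.

    All three arguments are flat lists of equal length; alpha lies in [0, 1].
    This is an illustrative helper, not part of the paper's method.
    """
    if not (len(alpha) == len(foreground) == len(background)):
        raise ValueError("alpha, foreground and background must have the same length")
    blended = []
    for a, f, b in zip(alpha, foreground, background):
        if not 0.0 <= a <= 1.0:
            raise ValueError("alpha values must lie in [0, 1]")
        blended.append(a * f + (1.0 - a) * b)
    return blended


# Hypothetical four-pixel grayscale strip: fully opaque foreground on the
# left, fading to pure background on the right.
alpha = [1.0, 0.5, 0.25, 0.0]
foreground = [200.0, 200.0, 200.0, 200.0]
background = [40.0, 40.0, 40.0, 40.0]

image = composite(alpha, foreground, background)
print(image)
```

Matting is the inverse problem: given only the blended image, recover alpha. It is underdetermined, which is why extra guidance such as a trimap, a background image, or (in this paper) a text description is supplied.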
Pages: 1 / 13
Number of pages: 12