Bidirectional Relationship Inferring Network for Referring Image Localization and Segmentation

Cited by: 11
Authors
Feng, Guang [1]
Hu, Zhiwei [1]
Zhang, Lihe [1]
Sun, Jiayu [1]
Lu, Huchuan [1]
Affiliations
[1] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; Location awareness; Visualization; Task analysis; Linguistics; Semantics; Feature extraction; Language-guided visual attention; referring image localization and segmentation; segmentation-guided feature augmentation; vision-guided linguistic attention (VLAM)
DOI
10.1109/TNNLS.2021.3106153
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Referring image localization and segmentation has recently attracted widespread interest. However, existing methods lack a clear description of the interdependence between language and vision. To this end, we present a bidirectional relationship inferring network (BRINet) to address these challenging tasks. Specifically, we first employ a vision-guided linguistic attention module to perceive the keywords corresponding to each image region. Then, a language-guided visual attention module uses the learned adaptive language features to guide the update of the visual features. Together, the two modules form a bidirectional cross-modal attention module (BCAM) that achieves mutual guidance between language and vision and helps the network better align the cross-modal features. Building on the vanilla language-guided visual attention, we further design an asymmetric language-guided visual attention, which significantly reduces the computational cost by modeling the relationship between each pixel and each pooled subregion rather than between every pair of pixels. In addition, a segmentation-guided bottom-up augmentation module (SBAM) selectively combines multilevel information flow for object localization. Experiments show that our method outperforms other state-of-the-art methods on three referring image localization datasets and four referring image segmentation datasets.
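The two attention directions described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the scaled dot-product scoring, the elementwise fusion of visual and language features, the average pooling, and the residual update are all assumptions made for illustration, since the abstract does not specify these details. The sketch only shows the general idea that each region attends over the words, and that each pixel then attends over a small set of pooled subregions instead of over all pixels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def vision_guided_linguistic_attention(visual, words):
    """Each image region attends over the words (assumed dot-product scoring).

    visual: (N, d) region features, words: (T, d) word features.
    Returns (N, d) region-specific language features and (N, T) weights.
    """
    scores = visual @ words.T / np.sqrt(visual.shape[1])
    weights = softmax(scores, axis=1)
    return weights @ words, weights


def average_pool_regions(features, h, w, grid):
    """Average-pool (h*w, d) features into grid*grid subregion features."""
    d = features.shape[1]
    bh, bw = h // grid, w // grid
    fmap = features.reshape(h, w, d)[: bh * grid, : bw * grid]
    pooled = fmap.reshape(grid, bh, grid, bw, d).mean(axis=(1, 3))
    return pooled.reshape(grid * grid, d)


def asymmetric_language_guided_visual_attention(visual, lang, h, w, grid):
    """Each pixel attends over pooled subregions rather than all pixels.

    The elementwise fusion and the residual update are assumptions.
    Returns updated (N, d) features and (N, grid*grid) weights.
    """
    query = visual * lang                             # language-modulated pixels
    keys = average_pool_regions(query, h, w, grid)    # (S, d), S = grid * grid
    values = average_pool_regions(visual, h, w, grid)
    scores = query @ keys.T / np.sqrt(visual.shape[1])  # (N, S), not (N, N)
    weights = softmax(scores, axis=1)
    return visual + weights @ values, weights


# Toy example: a 4x4 feature map with 8 channels and a 5-word expression.
rng = np.random.default_rng(0)
visual = rng.normal(size=(16, 8))
words = rng.normal(size=(5, 8))
lang, word_weights = vision_guided_linguistic_attention(visual, words)
updated, region_weights = asymmetric_language_guided_visual_attention(
    visual, lang, h=4, w=4, grid=2
)
```

In this sketch the score matrix of the second step has shape (N, S) with S = grid * grid pooled subregions, whereas attention between every pair of pixels would need an (N, N) matrix. That difference in size is the source of the cost reduction the abstract refers to.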
Pages: 2246-2258
Number of pages: 13