RESMatch: Referring expression segmentation in a semi-supervised manner

Cited by: 0
Authors
Zang, Ying [1]
Cao, Runlong [1]
Fu, Chenglong [1]
Zhu, Didi [2]
Zhang, Min [2]
Hu, Wenjun [1]
Zhu, Lanyun [3]
Chen, Tianrun [2]
Affiliations
[1] Huzhou Univ, Sch Informat Engn, Huzhou 313000, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Singapore Univ Technol & Design, Informat Syst Technol & Design Pillar, Singapore 487372, Singapore
Keywords
Referring expression segmentation; Semi-supervised learning; Data augmentation; Model adaptation guidance
DOI
10.1016/j.ins.2024.121709
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Referring expression segmentation (RES), a task that involves localizing specific instance-level objects on the basis of free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of both the visual and textual contexts, and often requires extensive training data. This paper introduces RESMatch, the first semi-supervised learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data annotation. RESMatch can also leverage the abundance of image-text paired data available in the current era of training large models, improving RES performance without the need for costly semantic annotation, as evidenced by our experimental results. Although existing SSL techniques are effective in image segmentation, we find that they fall short in RES. To address challenges such as the comprehension of free-form linguistic descriptions and the variability in object attributes, RESMatch introduces three adaptations: revised strong perturbation, text augmentation, and adjustments for pseudo-label quality and strong-weak supervision. RESMatch achieves state-of-the-art (SOTA) results in various experimental settings. This work lays the groundwork for future research in semi-supervised learning for referring expression segmentation.
Pages: 14