RESMatch: Referring expression segmentation in a semi-supervised manner

Cited by: 0

Authors:

Zang, Ying [1]

Cao, Runlong [1]

Fu, Chenglong [1]

Zhu, Didi [2]

Zhang, Min [2]

Hu, Wenjun [1]

Zhu, Lanyun [3]

Chen, Tianrun [2]

Affiliations:

[1] Huzhou Univ, Sch Informat Engn, Huzhou 313000, Peoples R China

[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China

[3] Singapore Univ Technol & Design, Informat Syst Technol & Design Pillar, Singapore 487372, Singapore

Source:

INFORMATION SCIENCES | 2025 / Vol. 694

Keywords:

Referring expression segmentation; Semi-supervised learning; Data augmentation; Model adaptation guidance;

DOI:

10.1016/j.ins.2024.121709

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Referring expression segmentation (RES), a task that involves localizing specific instance-level objects on the basis of free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of both visual and textual contexts and often requires extensive training data. This paper introduces RESMatch, the first semi-supervised learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data annotation. RESMatch can also leverage the abundance of image-text paired data available in the current era of large-model training, improving RES performance without the need for costly semantic annotation, as evidenced by our experimental results. Although existing SSL techniques are effective in image segmentation, we find that they fall short in RES. To address challenges such as the comprehension of free-form linguistic descriptions and the variability in object attributes, RESMatch introduces three adaptations: revised strong perturbation, text augmentation, and adjustments for pseudolabel quality and strong-weak supervision. RESMatch achieves state-of-the-art (SOTA) results in various experimental settings. This pioneering work lays the groundwork for future research in semi-supervised learning for referring expression segmentation.
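The abstract names pseudolabel quality and strong-weak supervision but does not give their formulation. As a rough illustration only, the sketch below shows a generic FixMatch-style consistency loss for a binary mask: predictions on a weakly perturbed view become pseudolabels, low-confidence pixels are filtered out, and the strongly perturbed view is trained against what remains. This is not RESMatch's actual loss; the function name `pseudo_label_loss`, the flat-list mask representation, and the threshold of 0.95 are all assumptions made for the example.

```python
import math


def pseudo_label_loss(weak_probs, strong_probs, threshold=0.95):
    """Confidence-masked binary cross-entropy (hypothetical helper).

    weak_probs / strong_probs: flat lists of per-pixel foreground
    probabilities predicted on the weak and strong views of one image.
    Returns (mean loss over kept pixels, number of kept pixels).
    """
    eps = 1e-7  # keeps log() away from 0
    total, kept = 0.0, 0
    for p_weak, p_strong in zip(weak_probs, strong_probs):
        # Confidence of the weak-view prediction for its predicted class.
        confidence = max(p_weak, 1.0 - p_weak)
        if confidence < threshold:
            continue  # assumed quality filter: drop uncertain pixels
        label = 1.0 if p_weak >= 0.5 else 0.0  # hard pseudolabel
        p = min(max(p_strong, eps), 1.0 - eps)
        total += -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
        kept += 1
    return (total / kept if kept else 0.0), kept


# Three pixels: the middle one is too uncertain and is ignored.
loss, kept = pseudo_label_loss([0.99, 0.60, 0.02], [0.90, 0.50, 0.10])
print(kept, round(loss, 4))
```

In a real training loop the same idea would be applied to tensors, with the weak-view prediction detached from the gradient; the pure-Python version here is only meant to make the masking step concrete.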

Pages: 14
