REVIVING ITERATIVE TRAINING WITH MASK GUIDANCE FOR INTERACTIVE SEGMENTATION

Cited by: 120

Authors:

Sofiiuk, Konstantin [1]

Petrov, Ilya A. [2]

Konushin, Anton [1]

Affiliations:

[1] Samsung AI Ctr, Moscow, Russia

[2] Univ Tubingen, Tubingen, Germany

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2022

Keywords:

interactive segmentation; segmentation; mask refinement

DOI:

10.1109/ICIP46576.2022.9897365

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. These methods are significantly more computationally expensive than feedforward approaches, as they run backward gradient passes during inference. Moreover, backward passes are not supported in popular mobile frameworks, which complicates the deployment of such methods on embedded devices. In this paper, we study design choices for interactive segmentation and discover that state-of-the-art results can be obtained without any additional optimization schemes. We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps. It allows not only segmenting an entirely new object but also correcting an existing mask. We analyze the performance of models trained on different datasets and observe that the choice of a training dataset has a large impact on the quality of interactive segmentation. We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations outperform all existing models. The code and trained models are available at https://github.com/saic-vul/ritm_interactive_segmentation.
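
The mask-guided feedforward loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the repository linked above). The names `encode_clicks`, `build_input`, `toy_model`, and `interactive_loop`, the disk click encoding, and the stand-in predictor are all assumptions made for illustration. The sketch only shows the data flow: at each step the input is the image plus click maps plus the mask from the previous step, and each step is a single forward call with no gradient computation.

```python
import numpy as np

def encode_clicks(clicks, shape, radius=5):
    """Rasterize (y, x, is_positive) clicks into two binary maps.
    Disk encoding and the radius are assumptions of this sketch."""
    maps = np.zeros((2, *shape), dtype=np.float32)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    for y, x, is_positive in clicks:
        disk = (yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2
        maps[0 if is_positive else 1][disk] = 1.0
    return maps

def build_input(image, clicks, prev_mask):
    """Stack image channels, click maps, and the previous mask: (C+3, H, W)."""
    click_maps = encode_clicks(clicks, prev_mask.shape)
    return np.concatenate([image, click_maps, prev_mask[None]], axis=0)

def toy_model(net_input):
    """Hypothetical stand-in for a trained network: keep the previous mask,
    add regions near positive clicks, erase regions near negative clicks."""
    pos, neg, prev = net_input[-3], net_input[-2], net_input[-1]
    return np.maximum(prev, pos) * (1.0 - neg)

def interactive_loop(image, clicks, model, init_mask=None):
    """Run one forward call per click, feeding back the previous mask."""
    h, w = image.shape[1:]
    if init_mask is None:
        mask = np.zeros((h, w), dtype=np.float32)  # segmenting a new object
    else:
        mask = init_mask.astype(np.float32)        # correcting an existing mask
    for i in range(1, len(clicks) + 1):
        mask = model(build_input(image, clicks[:i], mask))
    return mask
```

Calling `interactive_loop` without `init_mask` corresponds to segmenting a new object from scratch, while passing an existing mask corresponds to the correction use case mentioned in the abstract. In a real system `toy_model` would be replaced by a trained segmentation network.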


Pages: 3141-3145

Page count: 5
