Integrating instance-level knowledge to see the unseen: A two-stream network for video object segmentation

Cited by: 1

Authors:

Lu, Hannan [1]

Tian, Zhi [1]

Wei, Pengxu [1]

Ren, Haibing [1]

Zuo, Wangmeng [1]

Affiliations:

[1] Harbin Inst Technol, Sch Comp Sci & Technol, 92 Xidazhi St, Harbin 150006, Peoples R China

Source:

NEUROCOMPUTING | 2024 / Vol. 602

Funding:

National Natural Science Foundation of China;

Keywords:

Video object segmentation; Matching-based; Two-stream network; Pixel division; Instance stream;

DOI:

10.1016/j.neucom.2024.127878

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Existing matching-based video object segmentation (VOS) approaches are inherently limited in segmenting pixels that have never appeared in previous frames (i.e., unseen pixels). In this paper, we introduce a Two-Stream Network (TSN), which addresses this issue by softly distinguishing between seen and unseen pixels and processing them with two streams. Specifically, a pixel division module is devised to generate a routing map that distinguishes between seen and unseen pixels. Guided by the routing map, TSN explicitly integrates instance-level knowledge from an instance stream and pixel-level information from a pixel stream to generate the final segmentation result. The soft partitioning strategy allows for flexibility and adaptability in the fusion process. Additionally, the compact instance stream encodes and leverages instance-level knowledge, improving segmentation accuracy on the unseen pixels. Extensive experiments demonstrate the effectiveness of the proposed TSN, and we report state-of-the-art performance on public VOS benchmarks.
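The abstract does not give the fusion formula, so the following is only a minimal sketch of what routing-map-guided soft fusion could look like, assuming a per-pixel convex combination of the two streams' scores. The function name `soft_fuse`, the toy 2x2 maps, and the convention that a routing weight of 1 means "fully seen" are all hypothetical choices made for illustration, not the authors' implementation.

```python
def soft_fuse(pixel_scores, instance_scores, routing_map):
    """Blend two per-pixel score maps with a soft routing map.

    Assumption (not from the paper): a routing weight r in [0, 1] is the
    degree to which a pixel counts as "seen"; the fused score is
    r * pixel_stream + (1 - r) * instance_stream.
    """
    fused = []
    for p_row, i_row, r_row in zip(pixel_scores, instance_scores, routing_map):
        fused_row = []
        for p, i, r in zip(p_row, i_row, r_row):
            if not 0.0 <= r <= 1.0:
                raise ValueError("routing weights must lie in [0, 1]")
            fused_row.append(r * p + (1.0 - r) * i)
        fused.append(fused_row)
    return fused


# Toy 2x2 foreground scores from each (hypothetical) stream.
pixel_scores = [[0.9, 0.2], [0.8, 0.1]]
instance_scores = [[0.7, 0.6], [0.5, 0.9]]
# Routing map: 1.0 = fully "seen", 0.0 = fully "unseen".
routing_map = [[1.0, 0.0], [0.5, 0.25]]

fused = soft_fuse(pixel_scores, instance_scores, routing_map)
```

With a soft map, a pixel that is only partly matched in earlier frames draws on both streams instead of being forced into one, which is the flexibility the abstract attributes to soft partitioning.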


Pages: 12

