Integrating instance-level knowledge to see the unseen: A two-stream network for video object segmentation

Cited by: 1
Authors
Lu, Hannan [1]
Tian, Zhi [1]
Wei, Pengxu [1]
Ren, Haibing [1]
Zuo, Wangmeng [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, 92 Xidazhi St, Harbin 150006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video object segmentation; Matching-based; Two-stream network; Pixel division; Instance stream;
DOI
10.1016/j.neucom.2024.127878
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing matching-based video object segmentation (VOS) approaches are inherently limited in segmenting pixels that have never appeared in previous frames (i.e., unseen pixels). In this paper, we introduce a Two-Stream Network (TSN), which addresses this issue by softly distinguishing between seen and unseen pixels and processing them with two streams. Specifically, a pixel division module is devised to generate a routing map that distinguishes seen from unseen pixels. Guided by the routing map, TSN explicitly integrates instance-level knowledge from an instance stream with pixel-level information from a pixel stream to generate the final segmentation result. The soft partitioning strategy allows for flexibility and adaptability in the fusion process. Additionally, the compact instance stream encodes and leverages instance-level knowledge, improving segmentation accuracy on unseen pixels. Extensive experiments demonstrate the effectiveness of the proposed TSN, and we report state-of-the-art performance on public VOS benchmarks.
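The soft routing idea in the abstract can be illustrated with a small sketch. The abstract does not give the fusion formula, so everything below is an assumption made for illustration: the function name `soft_fuse`, the use of a sigmoid to turn a raw score into a routing weight, and the convex combination of the two streams are hypothetical and should not be read as the authors' implementation.

```python
import math

def sigmoid(x: float) -> float:
    """Map a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def soft_fuse(pixel_prob, instance_prob, seen_score):
    """Blend two per-pixel foreground probabilities with a soft routing map.

    All three arguments are H x W nested lists. `seen_score` holds raw
    (unbounded) scores; larger values mean the pixel more likely appeared
    in earlier frames. The names and the convex-combination rule are
    illustrative assumptions, not the paper's actual formulation.
    """
    fused = []
    for p_row, i_row, s_row in zip(pixel_prob, instance_prob, seen_score):
        row = []
        for p, i, s in zip(p_row, i_row, s_row):
            r = sigmoid(s)  # routing weight in (0, 1): near 1 = seen, near 0 = unseen
            row.append(r * p + (1.0 - r) * i)  # seen -> pixel stream, unseen -> instance stream
        fused.append(row)
    return fused

# Toy 2 x 2 example with made-up values.
pixel_prob = [[0.9, 0.2], [0.5, 0.1]]      # hypothetical pixel-stream output
instance_prob = [[0.8, 0.7], [0.5, 0.9]]   # hypothetical instance-stream output
seen_score = [[8.0, -8.0], [0.0, -8.0]]    # hypothetical pixel-division scores
fused = soft_fuse(pixel_prob, instance_prob, seen_score)
```

In this toy setup, a pixel with a strongly positive score follows the pixel stream almost entirely, a pixel with a strongly negative score follows the instance stream, and a score of zero averages the two. Because the weight is continuous rather than a hard 0/1 mask, the blend can shift gradually between streams, which is one way to read the "soft partitioning" described above.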
Pages: 12