Unsupervised Video Object Segmentation via Weak User Interaction and Temporal Modulation

Citations: 0

Authors:

FAN Jiaqing ^{[1]}

ZHANG Kaihua ^{[2,3]}

ZHAO Yaqian ^{[4]}

LIU Qingshan ^{[2,3]}

Affiliations:

[1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics

[2] College of Computer and Software, Nanjing University of Information Science and Technology

[3] Engineering Research Center of Digital Forensics, Ministry of Education

[4] Inspur Suzhou Intelligent Technology Corporation

Source:

Chinese Journal of Electronics | 2023 / Vol. 32 / No. 03

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP391.41

Discipline classification code:

080203

Abstract:

In unsupervised video object segmentation (UVOS), a model may segment the wrong target throughout the video because no initial prior information is available. In semi-supervised video object segmentation (SVOS), by contrast, a fine-grained pixel-level mask for the initial frame is essential to good segmentation accuracy, but providing accurate pixel-level masks for every training sequence is expensive and laborious. To address this issue, we present a weak user-interactive UVOS approach guided by a simple human-made rectangle annotation in the initial frame. We first interactively mark the region of interest with a rectangle, and then leverage the Mask R-CNN (region-based convolutional neural network) method to generate a set of coarse reference labels for subsequent mask propagation. To establish the temporal correspondence between the coherent frames, we further design two novel temporal modulation modules to enhance the target representations. First, we compute the earth mover's distance (EMD)-based similarity between coherent frames to mine the co-occurrent objects in the two images, which is used to modulate the target representation and highlight the foreground target. Second, we design a cross-squeeze temporal modulation module to emphasize the co-occurrent features across frames, which further enhances the foreground target representation. We then augment the temporally modulated representations with the original representation to obtain composite spatio-temporal information, producing a more accurate video object segmentation (VOS) model. Experimental results on both UVOS and SVOS datasets, including Davis2016, FBMS, Youtube-VOS, and Davis2017, show that our method yields favorable accuracy and complexity. The related code is available.
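As an illustration of the EMD-based similarity mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that each frame is summarized as a set of feature vectors with uniform weights, that the ground cost is one minus cosine similarity, and that the transport plan is approximated with entropic (Sinkhorn) regularization rather than solved exactly. The function names `cosine_cost`, `sinkhorn_plan`, and `emd_similarity`, and the parameters `reg` and `n_iters`, are hypothetical.

```python
import numpy as np

def cosine_cost(a, b):
    """Pairwise cost 1 - cosine similarity between rows of a (n, d) and b (m, d)."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return 1.0 - a @ b.T

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Approximate optimal transport plan between two uniform distributions.

    Assumption: entropic regularization (Sinkhorn iterations) stands in for
    an exact EMD solver; reg controls how close the plan is to the exact one.
    """
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)   # uniform mass on features of frame A
    nu = np.full(m, 1.0 / m)   # uniform mass on features of frame B
    K = np.exp(-cost / reg)    # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (K @ v + 1e-12)
        v = nu / (K.T @ u + 1e-12)
    return u[:, None] * K * v[None, :]

def emd_similarity(feat_a, feat_b, reg=0.1):
    """Similarity score: transported mass weighted by cosine similarity."""
    cost = cosine_cost(feat_a, feat_b)
    plan = sinkhorn_plan(cost, reg)
    return float(np.sum(plan * (1.0 - cost)))

# Toy example: identical feature sets versus mutually orthogonal ones.
same = np.eye(4)
sim_same = emd_similarity(same, same)
sim_diff = emd_similarity(np.eye(4)[:2], np.eye(4)[2:])
```

In this toy setting, `sim_same` comes out close to 1 and `sim_diff` is 0, so features that co-occur in both frames receive a high score. Under these assumptions, a score of this kind could be used to reweight target features, although the abstract does not specify how the modulation is actually applied.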

Pages: 507-518

Page count: 12
