Mask-Free Video Instance Segmentation

Citations: 16
Authors
Ke, Lei [1, 2]
Danelljan, Martin [1]
Ding, Henghui [1]
Tai, Yu-Wing [2]
Tang, Chi-Keung [2]
Yu, Fisher [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] HKUST, Hong Kong, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.02189
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The recent advancement in Video Instance Segmentation (VIS) has largely been driven by the use of deeper and increasingly data-hungry transformer-based models. However, video masks are tedious and expensive to annotate, limiting the scale and diversity of existing VIS datasets. In this work, we aim to remove the mask-annotation requirement. We propose MaskFreeVIS, achieving highly competitive VIS performance, while only using bounding box annotations for the object state. We leverage the rich temporal mask consistency constraints in videos by introducing the Temporal KNN-patch Loss (TK-Loss), providing strong mask supervision without any labels. Our TK-Loss finds one-to-many matches across frames, through an efficient patch-matching step followed by a K-nearest neighbor selection. A consistency loss is then enforced on the found matches. Our mask-free objective is simple to implement, has no trainable parameters, is computationally efficient, yet outperforms baselines employing, e.g., state-of-the-art optical flow to enforce temporal mask consistency. We validate MaskFreeVIS on the YouTube-VIS 2019/2021, OVIS and BDD100K MOTS benchmarks. The results clearly demonstrate the efficacy of our method by drastically narrowing the gap between fully and weakly-supervised VIS performance. Our code and trained models are available at http://vis.xyz/pub/maskfreevis.
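The matching-then-consistency idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `tk_loss_sketch`, the parameters `patch`, `radius`, `k`, and `max_dist`, the mean-squared patch distance, and the negative-log agreement term are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def tk_loss_sketch(frame_a, frame_b, mask_a, mask_b,
                   patch=3, radius=2, k=3, max_dist=0.05):
    """Toy temporal KNN-patch consistency loss between two frames.

    frame_a, frame_b: (H, W) grayscale images with values in [0, 1].
    mask_a, mask_b:   (H, W) predicted foreground probabilities.
    All hyperparameters are illustrative assumptions, not published values.
    """
    h, w = frame_a.shape
    half = patch // 2  # patch is assumed to be odd
    pad_a = np.pad(frame_a, half, mode="edge")
    pad_b = np.pad(frame_b, half, mode="edge")
    eps = 1e-6
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            src = pad_a[y:y + patch, x:x + patch]
            # Patch matching: compare against every location within `radius`.
            cands = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        tgt = pad_b[yy:yy + patch, xx:xx + patch]
                        dist = float(np.mean((src - tgt) ** 2))
                        cands.append((dist, yy, xx))
            # KNN selection: keep the k closest matches that are close enough.
            cands.sort(key=lambda c: c[0])
            for dist, yy, xx in cands[:k]:
                if dist > max_dist:
                    break
                # Consistency: matched pixels should share the same mask label.
                agree = (mask_a[y, x] * mask_b[yy, xx]
                         + (1 - mask_a[y, x]) * (1 - mask_b[yy, xx]))
                total += -np.log(agree + eps)
                count += 1
    return total / max(count, 1)
```

The loops are written for readability rather than speed. The sketch has no trainable parameters and uses no mask labels, only the two frames and the two predicted masks, which is the property the abstract emphasizes.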
Pages: 22857-22866
Page count: 10