Complementary Coarse-to-Fine Matching for Video Object Segmentation

Cited by: 2
Authors
Chen, Zhen [1]
Yang, Ming [2]
Zhang, Shiliang [1]
Affiliations
[1] Peking Univ, Sch Comp Sci, Natl Key Lab Multimedia Informat Proc, 5 Yiheyuan Rd, Beijing 100871, Peoples R China
[2] Ant Grp, Multimodal Cognit, 525 Almanor Ave, Sunnyvale, CA 94085 USA
Keywords
Video object segmentation; coarse-to-fine matching; label propagation
DOI
10.1145/3596496
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Semi-supervised Video Object Segmentation (VOS) needs to establish pixel-level correspondences between a video frame and preceding segmented frames to leverage their segmentation clues. Most works rely on features at a single scale to establish those correspondences, e.g., performing dense matching with Convolutional Neural Network (CNN) features from a deep layer. In contrast, this work explores complementary features at different scales to pursue more robust feature matching. A coarse feature from a deep layer is first adopted to obtain coarse pixel-level correspondences. We then evaluate the quality of those correspondences and select pixels with low-quality correspondences for fine-scale feature matching. Segmentation clues from previous frames are propagated through both coarse and fine-scale correspondences and are fused with appearance features for object segmentation. Compared with previous works, this coarse-to-fine matching scheme is more robust to distractions by similar objects and better preserves object details. The sparse fine-scale matching also ensures a fast inference speed. On popular VOS datasets including DAVIS and YouTube-VOS, the proposed method shows promising performance compared with recent works.
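The select-then-refine idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how correspondence quality is measured or how the two propagated results are fused, so the sketch makes several assumptions: features are already flattened to one row per pixel, with coarse and fine features aligned to the same pixel grid; correspondence quality is taken to be the largest softmax matching weight; and low-quality pixels simply have their coarse result overwritten by the fine one. The names `propagate`, `coarse_to_fine_propagate`, `temperature`, and `conf_threshold` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def propagate(query, memory, labels, temperature=0.1):
    """Soft label propagation from memory pixels to query pixels.

    query:  (Nq, C) features of the current frame
    memory: (Nm, C) features of previously segmented frames
    labels: (Nm, K) soft or one-hot labels of the memory pixels
    Returns the propagated labels (Nq, K) and, per query pixel, the
    largest matching weight (used here as an assumed quality score).
    """
    sim = l2_normalize(query) @ l2_normalize(memory).T  # cosine similarity
    logits = sim / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ labels, weights.max(axis=1)


def coarse_to_fine_propagate(q_coarse, m_coarse, q_fine, m_fine, labels,
                             conf_threshold=0.9):
    """Coarse matching everywhere, fine matching only where it looks unreliable."""
    out, confidence = propagate(q_coarse, m_coarse, labels)
    low = confidence < conf_threshold  # assumed low-quality criterion
    if low.any():
        # Sparse step: only the selected pixels are re-matched at fine scale.
        refined, _ = propagate(q_fine[low], m_fine, labels)
        out[low] = refined
    return out, low
```

Because the fine-scale similarity matrix is computed only for the rows selected by `low`, its cost scales with the number of low-quality pixels rather than with the full frame, which is the intuition behind the speed claim in the abstract.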
Pages: 21