Quality-aware pattern diffusion for video object segmentation

Cited by: 3
Authors
Zhou, Chuanwei [1]
Xu, Chunyan [1]
Li, Jun [1]
Cui, Zhen [1]
Yang, Jian [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Jiangsu Key Lab Image & Video Understanding Social, PCA Lab, Key Lab Intelligent Percept & Syst High Di, Nanjing, Peoples R China
Keywords
Video object segmentation; Quality-aware pattern diffusion; Quality alignment mechanism
DOI
10.1016/j.neucom.2023.01.044
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, great progress has been made on the video object segmentation (VOS) problem with the support of memory mechanisms. Despite these achievements, existing VOS approaches still suffer from abnormal samples, which derive from intrinsic video artifacts such as occlusion and motion blur. To mitigate this issue, we propose a quality-aware pattern diffusion (QPD) framework to boost VOS performance. To achieve quality-aware pattern diffusion, we propose a quality alignment mechanism that promotes the contributions of normal samples while suppressing abnormal ones during the feature propagation/diffusion processes. With the proposed quality alignment mechanism, the diffused instance features are kept in the normal feature space, avoiding feature contamination caused by low-quality samples. We first introduce a learnable quality evaluator to assess sample quality in both the temporal domain (i.e., across the historical frames) and the spatial domain (i.e., within the current frame). To adaptively propagate historical features into the current instance, we propose a quality-aware long-term context propagation module, with which more stable instance representations can be obtained through the established quality-aware feature propagation process. A quality-aware pattern diffusion module is further introduced to address spatial-domain abnormal samples, resulting in effective decoder feature refinement by building quality-aware correspondence weights. Extensive experiments demonstrate that the proposed quality alignment mechanism boosts performance by a large margin over a strong baseline while achieving state-of-the-art performance on public VOS benchmarks. © 2023 Elsevier B.V. All rights reserved.
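The idea of promoting normal samples and suppressing abnormal ones during feature propagation can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the function name `quality_weighted_read`, the use of scaled dot-product affinities, and the choice to multiply attention weights by a quality score in (0, 1] are all assumptions made for illustration only. In the paper the quality scores come from a learnable evaluator; here they are passed in by hand.

```python
import math

def quality_weighted_read(query, keys, values, qualities, eps=1e-8):
    """Hypothetical memory read: attention weights are scaled by per-sample quality."""
    d = len(query)
    # Scaled dot-product affinity between the query and each memory key.
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Adding log(quality) to a logit multiplies its softmax weight by that quality
    # (before renormalization), so low-quality samples contribute less.
    logits = [l + math.log(max(s, eps)) for l, s in zip(logits, qualities)]
    # Numerically stable softmax over the memory entries.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the memory values.
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights

# Two memory samples with identical affinity to the query but different quality
# (assumed scores: the second sample stands in for an occluded or blurred frame).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
out, weights = quality_weighted_read(query, keys, values, qualities=[0.9, 0.1])
```

Because both keys match the query equally well, the final weights are determined by the quality scores alone, and the read-out is dominated by the high-quality sample.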
Pages: 148-159
Number of pages: 12
Related papers
65 in total
  • [1] Ba J., 2015, P INT C LEARN REPR, P1
  • [2] One-Shot Video Object Segmentation
    Caelles, S.
    Maninis, K. -K.
    Pont-Tuset, J.
    Leal-Taixe, L.
    Cremers, D.
    Van Gool, L.
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5320 - 5329
  • [3] Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
    Chen, Yuhua
    Pont-Tuset, Jordi
    Montes, Alberto
    Van Gool, Luc
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 1189 - 1198
  • [4] Fast and Accurate Online Video Object Segmentation via Tracking Parts
    Cheng, Jingchun
    Tsai, Yi-Hsuan
    Hung, Wei-Chih
    Wang, Shengjin
    Yang, Ming-Hsuan
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7415 - 7424
  • [5] SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
    Cheng, Jingchun
    Tsai, Yi-Hsuan
    Wang, Shengjin
    Yang, Ming-Hsuan
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 686 - 695
  • [6] Global Contrast Based Salient Region Detection
    Cheng, Ming-Ming
    Mitra, Niloy J.
    Huang, Xiaolei
    Torr, Philip H. S.
    Hu, Shi-Min
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (03) : 569 - 582
  • [7] Unsupervised learning from video to detect foreground objects in single images
    Croitoru, Ioana
    Bogolin, Simion-Vlad
    Leordeanu, Marius
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4345 - 4353
  • [8] Cui Y., 2021, P IEEECVF INT C COMP, P8138
  • [9] SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation
    Duke, Brendan
    Ahmed, Abdalla
    Wolf, Christian
    Aarabi, Parham
    Taylor, Graham W.
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5908 - 5917
  • [10] The Pascal Visual Object Classes (VOC) Challenge
    Everingham, Mark
    Van Gool, Luc
    Williams, Christopher K. I.
    Winn, John
    Zisserman, Andrew
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88 (02) : 303 - 338