SiamPolar: Semi-supervised realtime video object segmentation with polar representation

Citations: 6
Authors
Li, Yaochen [1]
Hong, Yuhui [1]
Song, Yonghong [1]
Zhu, Chao [1]
Zhang, Ying [1]
Wang, Ruihao [2]
Affiliations
[1] Xi'an Jiaotong University, School of Software Engineering, Xi'an, Shaanxi, People's Republic of China
[2] MEGVII Technology, Beijing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Video object track; Realtime video segmentation; Polar representation; Siamese network;
DOI
10.1016/j.neucom.2021.09.063
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video object segmentation (VOS) is an essential part of autonomous vehicle navigation. Besides accuracy, real-time speed is critical for algorithms used in autonomous vehicles. In this paper, we propose a semi-supervised real-time method based on a Siamese network that uses a new polar representation. The input is initialized with bounding boxes rather than object masks, so the method can also be applied to video object detection tasks. The polar representation reduces the number of parameters needed to encode masks with only a slight loss of accuracy, which significantly improves the algorithm's speed. An asymmetric Siamese network is also developed to extract features at different spatial scales. Moreover, peeling convolution is proposed to reduce the antagonism among the branches of the polar head. The repeated cross-correlation and semi-FPN are designed based on this idea. Experimental results on the DAVIS2016 dataset and other public datasets demonstrate the effectiveness of the proposed method. © 2021 Elsevier B.V. All rights reserved.
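To illustrate why a polar representation needs far fewer parameters than a pixel mask, the following is a minimal sketch, not the authors' implementation. It assumes the mask centroid as the polar origin and 36 evenly spaced rays; both choices, and the function names `mask_to_polar` and `polar_to_contour`, are hypothetical and are not taken from the abstract.

```python
import numpy as np


def mask_to_polar(mask, num_rays=36):
    """Encode a binary mask as a center point plus one length per angular bin.

    Assumption: the centroid is used as the polar origin, and each ray length
    is the distance to the farthest foreground pixel in that angular bin.
    """
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()

    # Angle and distance of every foreground pixel relative to the center.
    dy, dx = ys - cy, xs - cx
    angles = np.arctan2(dy, dx) % (2 * np.pi)
    dists = np.hypot(dx, dy)

    # Assign each pixel to an angular bin and keep the largest distance per bin.
    bins = np.floor(angles / (2 * np.pi) * num_rays).astype(int) % num_rays
    rays = np.zeros(num_rays)
    np.maximum.at(rays, bins, dists)
    return (cx, cy), rays


def polar_to_contour(center, rays):
    """Decode the polar encoding into contour points, one per bin center."""
    cx, cy = center
    n = len(rays)
    thetas = (np.arange(n) + 0.5) * 2 * np.pi / n
    return np.stack([cx + rays * np.cos(thetas),
                     cy + rays * np.sin(thetas)], axis=1)


if __name__ == "__main__":
    # Toy example: a filled disk of radius 20 in a 64 x 64 frame.
    yy, xx = np.mgrid[:64, :64]
    disk = (yy - 32) ** 2 + (xx - 32) ** 2 <= 20 ** 2
    center, rays = mask_to_polar(disk)
    contour = polar_to_contour(center, rays)
    print("dense mask values:", disk.size)
    print("polar parameters:", len(rays) + 2)
    print("contour points:", contour.shape)
```

In this toy setting the dense mask holds 64 × 64 = 4096 values, while the polar encoding holds 36 ray lengths plus 2 center coordinates, 38 numbers in total. The cost is that a single length per angle cannot capture strongly concave shapes exactly, which is consistent with the slight accuracy loss the abstract mentions.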
Pages: 491-503
Number of pages: 13