Video object tracking and segmentation with box annotation

Cited by: 3
Authors
Wang, Ye [1]
Choi, Jongmoo [1]
Zhang, Kaitai [1]
Huang, Qin [2]
Chen, Yueru [1]
Lee, Ming-Sui [3]
Kuo, C-C Jay [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
[2] Facebook, Menlo Pk, CA USA
[3] Natl Taiwan Univ, Taipei, Taiwan
Keywords
Video object tracking; Video object segmentation; Reverse optimization; Bounding box annotation;
DOI
10.1016/j.image.2020.115858
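Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code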
0808 ; 0809 ;
Abstract
This paper presents a two-stage approach, track and then segment, to perform semi-supervised video object segmentation (VOS) with only bounding box annotations. The proposed method, reverse optimization for VOS (ROVOS), leverages a fully convolutional Siamese network to perform both tracking and segmentation within the tracker. The segmentation cues are used to reversely optimize the location of the tracker, and the object segmentation masks are produced online by the two-branch system. Experimental results on DAVIS 2016 and DAVIS 2017 demonstrate significant improvements of the proposed algorithm over state-of-the-art methods.
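The abstract does not say how the segmentation cues adjust the tracker's location. As a rough illustration of the general idea only, and not the ROVOS algorithm itself, the sketch below derives a tight box from a binary mask and blends it with the tracker's box. The names `mask_to_box` and `refine_box` and the `weight` parameter are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates
Box = Tuple[int, int, int, int]


def mask_to_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the tight box around nonzero mask pixels, or None if the mask is empty."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def refine_box(tracker_box: Box, mask: List[List[int]], weight: float = 0.5) -> Box:
    """Blend the tracker's box with the box implied by the mask (assumed update rule).

    weight = 0 keeps the tracker box; weight = 1 trusts the mask entirely.
    An empty mask leaves the tracker box unchanged.
    """
    mask_box = mask_to_box(mask)
    if mask_box is None:
        return tracker_box
    return tuple(
        round((1 - weight) * t + weight * m) for t, m in zip(tracker_box, mask_box)
    )


# Toy 4x6 mask whose foreground spans columns 2..4 and rows 1..2.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
refined = refine_box((0, 0, 5, 3), mask, weight=1.0)
```

With `weight=1.0` the refined box collapses onto the mask extent. Intermediate weights damp the update, which is one plausible way to keep a noisy mask from dragging the tracker off the object.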
Pages: 6
Related papers
38 entries in total
[1]  
[Anonymous], 2019, arXiv:1902.09513
[2]  
[Anonymous], 2018, arXiv:1812.05050
[3]   CNN in MRF: Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF [J].
Bao, Linchao ;
Wu, Baoyuan ;
Liu, Wei .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :5977-5986
[4]   Fully-Convolutional Siamese Networks for Object Tracking [J].
Bertinetto, Luca ;
Valmadre, Jack ;
Henriques, Joao F. ;
Vedaldi, Andrea ;
Torr, Philip H. S. .
COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 :850-865
[5]   One-Shot Video Object Segmentation [J].
Caelles, S. ;
Maninis, K. -K. ;
Pont-Tuset, J. ;
Leal-Taixe, L. ;
Cremers, D. ;
Van Gool, L. .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :5320-5329
[6]   CaMap: Camera-based Map Manipulation on Mobile Devices [J].
Chen, Liang ;
Chen, Dongyi .
PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE2018), 2018,
[7]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[8]   Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning [J].
Chen, Yuhua ;
Pont-Tuset, Jordi ;
Montes, Alberto ;
Van Gool, Luc .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :1189-1198
[9]   Fast and Accurate Online Video Object Segmentation via Tracking Parts [J].
Cheng, Jingchun ;
Tsai, Yi-Hsuan ;
Hung, Wei-Chih ;
Wang, Shengjin ;
Yang, Ming-Hsuan .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7415-7424
[10]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848