State-Aware Tracker for Real-Time Video Object Segmentation

Cited by: 93
Authors
Chen, Xi [1, 2]
Li, Zuoxin [2]
Yuan, Ye [2]
Yu, Gang [2]
Shen, Jianxin [1]
Qi, Donglian [1]
Affiliations
[1] Zhejiang Univ, Coll Elect Engn, Hangzhou, Peoples R China
[2] Megvii Inc, Beijing, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
DOI
10.1109/CVPR42600.2020.00940
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we address the task of semi-supervised video object segmentation (VOS) and explore how to make efficient use of video properties to tackle the challenge of semi-supervision. We propose a novel pipeline called State-Aware Tracker (SAT), which produces accurate segmentation results at real-time speed. For higher efficiency, SAT takes advantage of inter-frame consistency and treats each target object as a tracklet. For more stable and robust performance over video sequences, SAT estimates the state of each frame and adapts itself via two feedback loops. One loop helps SAT generate more stable tracklets. The other loop helps construct a more robust and holistic target representation. SAT achieves a promising result of 72.3% J&F mean at 39 FPS on the DAVIS2017-Val dataset, which shows a decent trade-off between efficiency and accuracy.
Pages: 9381-9390
Page count: 10