Video object segmentation based on motion-aware ROI prediction and adaptive reference updating

Cited by: 7
Authors
Fu, Lihua [1]
Zhao, Yu [1, 2]
Sun, Xiaowei [1]
Huang, Jialiang [1]
Wang, Dan [1]
Ding, Yu [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
[2] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
Keywords
Video object segmentation; Region of interest prediction; Adaptive reference updating; Siamese network
DOI
10.1016/j.eswa.2020.114153
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Video object segmentation (VOS) is an active research topic in computer vision. Existing deep-learning-based VOS methods adapt poorly to changes in object appearance and segment slowly. In this paper, we propose a robust VOS method based on motion-aware region of interest (ROI) prediction and adaptive reference updating. First, we use the historical movement trajectory of the target region to perceive its motion trend dynamically, predict the motion-aware ROI of the target object in the current frame, and use that ROI as the input to the segmentation network. Second, to adapt to appearance changes of the target over the video, we introduce an adaptive reference updating strategy that dynamically updates the reference frame during segmentation. Finally, we design a VOS Siamese network for fast segmentation. Experiments on three public benchmark datasets, including DAVIS-2016 and DAVIS-2017, show that the proposed method outperforms state-of-the-art approaches.
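The abstract does not give the formula used for motion-aware ROI prediction, so the following is only a minimal sketch of the general idea, assuming a constant-velocity extrapolation of the target's bounding-box center over a short history window. The function name `predict_roi`, the `margin` and `window` parameters, and the box layout are all hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def predict_roi(
    history: List[Box],
    frame_size: Tuple[int, int],
    margin: float = 0.5,
    window: int = 3,
) -> Box:
    """Extrapolate the next-frame ROI from recent target boxes.

    Assumption (not from the paper): the target center moves with
    constant velocity, estimated over the last `window` boxes.
    """
    if not history:
        raise ValueError("history must contain at least one box")

    recent = history[-window:]
    centers = [((b[0] + b[2]) / 2, (b[1] + b[3]) / 2) for b in recent]

    # Average per-frame displacement of the center over the window.
    if len(centers) > 1:
        steps = len(centers) - 1
        vx = (centers[-1][0] - centers[0][0]) / steps
        vy = (centers[-1][1] - centers[0][1]) / steps
    else:
        vx = vy = 0.0

    # Shift the last center by one frame of estimated motion.
    cx = centers[-1][0] + vx
    cy = centers[-1][1] + vy

    # Enlarge the last box by `margin`, plus extra room along the motion.
    last = recent[-1]
    half_w = (last[2] - last[0]) * (1 + margin) / 2 + abs(vx)
    half_h = (last[3] - last[1]) * (1 + margin) / 2 + abs(vy)

    # Clip the ROI to the frame boundaries.
    width, height = frame_size
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(width), cx + half_w),
        min(float(height), cy + half_h),
    )
```

In a pipeline of the kind the abstract describes, a crop of the current frame at the returned ROI would be passed to the segmentation network instead of the full frame, which is one plausible way a smaller input could reduce per-frame cost.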
Pages: 13
Related papers
38 in total
  • [1] The Lovasz-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
    Berman, Maxim
    Triki, Amal Rannen
    Blaschko, Matthew B.
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4413 - 4421
  • [2] Fully-Convolutional Siamese Networks for Object Tracking
    Bertinetto, Luca
    Valmadre, Jack
    Henriques, Joao F.
    Vedaldi, Andrea
    Torr, Philip H. S.
    [J]. COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II, 2016, 9914 : 850 - 865
  • [3] One-Shot Video Object Segmentation
    Caelles, S.
    Maninis, K.-K.
    Pont-Tuset, J.
    Leal-Taixe, L.
    Cremers, D.
    Van Gool, L.
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5320 - 5329
  • [4] Caelles S., 2017, ARXIV170401926
  • [5] Caelles Sergi, 2018, ARXIV170400675
  • [6] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
    Chen, Liang-Chieh
    Papandreou, George
    Kokkinos, Iasonas
    Murphy, Kevin
    Yuille, Alan L.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) : 834 - 848
  • [7] Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
    Chen, Yuhua
    Pont-Tuset, Jordi
    Montes, Alberto
    Van Gool, Luc
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 1189 - 1198
  • [8] Fast and Accurate Online Video Object Segmentation via Tracking Parts
    Cheng, Jingchun
    Tsai, Yi-Hsuan
    Hung, Wei-Chih
    Wang, Shengjin
    Yang, Ming-Hsuan
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7415 - 7424
  • [9] SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
    Cheng, Jingchun
    Tsai, Yi-Hsuan
    Wang, Shengjin
    Yang, Ming-Hsuan
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 686 - 695
  • [10] Xception: Deep Learning with Depthwise Separable Convolutions
    Chollet, Francois
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 1800 - 1807