Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation

Cited by: 204

Authors:

Li, Xiaoxiao [1]

Loy, Chen Change [2]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Peoples R China

[2] Nanyang Technol Univ, Singapore, Singapore

Source:

COMPUTER VISION - ECCV 2018, PT III | 2018 / Vol. 11207

Keywords:

DOI:

10.1007/978-3-030-01219-9_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The problem of video object segmentation can become extremely challenging when multiple instances co-exist. While each instance may exhibit large scale and pose variations, the problem is compounded when instances occlude each other causing failures in tracking. In this study, we formulate a deep recurrent network that is capable of segmenting and tracking objects in video simultaneously by their temporal continuity, yet able to re-identify them when they re-appear after a prolonged occlusion. We combine temporal propagation and re-identification functionalities into a single framework that can be trained end-to-end. In particular, we present a re-identification module with template expansion to retrieve missing objects despite their large appearance changes. In addition, we contribute an attention-based recurrent mask propagation approach that is robust to distractors not belonging to the target segment. Our approach achieves a new state-of-the-art G-mean of 68.2 on the challenging DAVIS 2017 benchmark (test-dev set), outperforming the winning solution. Project Page: http://mmlab.ie.cuhk.edu.hk/projects/DyeNet/.

Pages: 93-110

Number of pages: 18