Video Saliency Detection via Spatial-Temporal Fusion and Low-Rank Coherency Diffusion

Cited by: 161
Authors
Chen, Chenglizhao [1 ]
Li, Shuai [1 ]
Wang, Yongguang [1 ]
Qin, Hong [2 ]
Hao, Aimin [1 ]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
Funding
National Natural Science Foundation of China; National Science Foundation (USA);
Keywords
Spatial-temporal saliency fusion; low-rank coherency guided saliency diffusion; video saliency; visual saliency; MOVING OBJECT DETECTION; TRACKING;
DOI
10.1109/TIP.2017.2670143
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper advocates a novel video saliency detection method based on spatial-temporal saliency fusion and low-rank coherency guided saliency diffusion. In sharp contrast to conventional methods, which conduct saliency detection locally in a frame-by-frame manner and can easily produce incorrect low-level saliency maps, this paper proposes to fuse color saliency based on global motion clues in a batch-wise fashion. We further propose low-rank coherency guided spatial-temporal saliency diffusion to guarantee the temporal smoothness of the saliency maps, together with a series of saliency boosting strategies that further improve saliency accuracy. First, the original long-term video sequence is equally segmented into short-term frame batches, and the motion clues of each batch are integrated and diffused temporally to facilitate the computation of color saliency. Then, based on the obtained saliency clues, inter-batch saliency priors are modeled to guide the low-level saliency fusion. After that, both the raw color information and the fused low-level saliency are regarded as low-rank coherency clues, which guide the spatial-temporal saliency diffusion with the help of an additional permutation matrix serving as an alternative rank-selection strategy. This guarantees the temporal consistency of the saliency maps and further boosts their accuracy. Moreover, we conduct extensive experiments on five publicly available benchmarks and make comprehensive quantitative comparisons between our method and 16 state-of-the-art techniques. All results demonstrate the superiority of our method in accuracy, reliability, robustness, and versatility.
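The low-rank coherency step described above rests on the idea that saliency clues stacked across a batch of frames form a matrix whose temporally consistent part is low-rank, while transient detection errors are sparse. The paper's exact formulation (including its permutation-matrix rank selection) is not reproduced here; as a minimal sketch of the underlying low-rank-plus-sparse decomposition, the following uses robust PCA solved by the inexact augmented Lagrange multiplier method. The function names and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def svd_shrink(X, tau):
    # Singular value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft_shrink(X, tau):
    # Element-wise soft thresholding: proximal operator of the l1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(D, n_iter=200, tol=1e-7):
    """Split D into a low-rank part L (temporally coherent structure across
    the frame batch) and a sparse part S (transient errors), via inexact ALM."""
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))          # standard RPCA weight
    norm_D = np.linalg.norm(D)
    spec = np.linalg.norm(D, 2)             # spectral norm
    Y = D / max(spec, np.max(np.abs(D)) / lam)   # dual variable init
    mu, rho = 1.25 / spec, 1.5              # penalty and its growth rate
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(n_iter):
        L = svd_shrink(D - S + Y / mu, 1.0 / mu)
        S = soft_shrink(D - L + Y / mu, lam / mu)
        resid = D - L - S
        Y = Y + mu * resid
        mu *= rho
        if np.linalg.norm(resid) / norm_D < tol:
            break
    return L, S
```

In a video-saliency setting, each column of `D` would hold one frame's vectorized saliency (or color) clues for the batch, so `L` captures the batch-wise coherent saliency and `S` absorbs frame-local outliers.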
Pages: 3156-3170
Page count: 15
References
46 entries in total
[31]   SuBSENSE: A Universal Change Detection Method With Local Adaptive Sensitivity [J].
St-Charles, Pierre-Luc ;
Bilodeau, Guillaume-Alexandre ;
Bergevin, Robert .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (01) :359-373
[32]   Learning patterns of activity using real-time tracking [J].
Stauffer, C ;
Grimson, WEL .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (08) :747-757
[33]   Motion Coherent Tracking Using Multi-label MRF Optimization [J].
Tsai, David ;
Flagg, Matthew ;
Nakazawa, Atsushi ;
Rehg, James M. .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2012, 100 (02) :190-202
[34]  
Varadarajan S, 2013, 2013 10TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2013), P63, DOI 10.1109/AVSS.2013.6636617
[35]   Robust multi-modal medical image fusion via anisotropic heat diffusion guided low-rank structural analysis [J].
Wang, Qingzheng ;
Li, Shuai ;
Qin, Hong ;
Hao, Aimin .
INFORMATION FUSION, 2015, 26 :103-121
[36]   Static and Moving Object Detection Using Flux Tensor with Split Gaussian Models [J].
Wang, Rui ;
Bunyak, Filiz ;
Seetharaman, Guna ;
Palaniappan, Kannappan .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2014, :420-+
[37]   Consistent Video Saliency Using Local Gradient Flow Optimization and Global Refinement [J].
Wang, Wenguan ;
Shen, Jianbing ;
Shao, Ling .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (11) :4185-4196
[38]  
Wang, Wenguan, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), P3395, DOI 10.1109/CVPR.2015.7298961
[39]  
Wright J., 2009, ADV NEURAL INFORM PR, V58, P289, DOI 10.1109/NNSP.2000.889420
[40]  
Xie Y., 2013, IEEE T IMAGE PROCESS, V22, P314