Near-Duplicate Image Detection in a Visually Salient Riemannian Space

Cited by: 19

Authors:

Zheng, Ligang [1]

Lei, Yanqiang [1]

Qiu, Guoping [2]

Huang, Jiwu [1]

Affiliations:

[1] Sun Yat Sen Univ, Sch Informat Sci Technol, Guangzhou 510006, Guangdong, Peoples R China

[2] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2012 / Vol. 7 / No. 5

Keywords:

Affine-invariant Riemannian metric; logarithm matrix; near-duplicate detection; region covariance; Riemannian manifold; saliency map; visual attention; ATTENTION; SCENES; SCALE;

DOI:

10.1109/TIFS.2012.2206386

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

This paper presents a framework for near-duplicate image detection in a visually salient Riemannian space. A visual saliency model is first used to identify salient regions of the image and then the salient region covariance matrix (SCOV) of various image features is computed. SCOV, which lies in a Riemannian manifold, is used as a robust and compact image content descriptor. An efficient coarse-to-fine Riemannian (CTOFR) image search strategy has been developed to improve efficiency while maintaining accuracy. CTOFR first uses a computationally fast but less accurate log-Euclidean Riemannian metric to do a coarse level search of the entire database and retrieve a subset of likely targets and then uses a computationally expensive but more accurate affine-invariant Riemannian metric to search the returns from the coarse search. We present experimental results to demonstrate that SCOV is a very compact, robust, and discriminative descriptor which is competitive with other state-of-the-art descriptors for near-duplicate image and video detection. We show that CTOFR can yield significant speedups over traditional full search methods without sacrificing accuracy, and that the larger the database the higher the speedup factor.
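The two-stage search described in the abstract can be illustrated with a minimal sketch, assuming NumPy and assuming each image has already been reduced to a symmetric positive-definite covariance descriptor. The function names and the `shortlist_size` parameter are hypothetical and chosen for illustration; saliency detection and feature extraction are omitted, and this is not the authors' implementation. The distances use the standard textbook forms: the log-Euclidean distance is the Frobenius norm of the difference of matrix logarithms, and the affine-invariant distance is the root of the summed squared log-eigenvalues of A^(-1/2) B A^(-1/2).

```python
import numpy as np


def spd_log(m):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T


def log_euclidean_dist(a, b):
    """Cheaper metric: Frobenius norm of log(a) - log(b)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")


def affine_invariant_dist(a, b):
    """Costlier metric: sqrt(sum(log^2 of eigenvalues of a^(-1/2) b a^(-1/2)))."""
    w, v = np.linalg.eigh(a)
    a_inv_sqrt = (v / np.sqrt(w)) @ v.T
    lam = np.linalg.eigvalsh(a_inv_sqrt @ b @ a_inv_sqrt)
    return np.sqrt(np.sum(np.log(lam) ** 2))


def coarse_to_fine_search(query, database, shortlist_size=10):
    """Hypothetical two-stage search over a list of SPD descriptors.

    Stage 1 ranks the whole database with the log-Euclidean distance.
    Stage 2 re-ranks only the shortlist with the affine-invariant distance.
    Returns (index of best match, its affine-invariant distance).
    """
    log_q = spd_log(query)
    # In practice the database logarithms could be precomputed once and stored.
    coarse = np.array([np.linalg.norm(spd_log(c) - log_q, ord="fro") for c in database])
    shortlist = np.argsort(coarse)[:shortlist_size]
    fine = [(affine_invariant_dist(query, database[i]), int(i)) for i in shortlist]
    best_dist, best_idx = min(fine)
    return best_idx, float(best_dist)
```

In this sketch the coarse stage is cheap because the logarithm of each database descriptor depends only on that descriptor, whereas the affine-invariant distance needs a fresh eigendecomposition for every query-candidate pair; that is one plausible reading of why restricting the second metric to a shortlist saves time.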


Pages: 1578-1593

Page count: 16
