Discovering Video Clusters from Visual Features and Noisy Tags

Cited by: 0

Authors:

Vahdat, Arash [1]

Zhou, Guang-Tong [1]

Mori, Greg [1]

Affiliations:

[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada

Source:

COMPUTER VISION - ECCV 2014, PT VI | 2014 / Vol. 8694

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an algorithm for automatically clustering tagged videos. Collections of tagged videos are commonplace; however, it is not trivial to discover video clusters therein. Direct methods that operate on visual features ignore the regularly available, valuable source of tag information. Clustering videos solely on these tags is error-prone since the tags are typically noisy. To address these problems, we develop a structured model that considers the interaction between visual features, video tags and video clusters. We model tags from visual features, and correct noisy tags by checking visual appearance consistency. Finally, videos are clustered from the refined tags as well as the visual features. We learn the clustering through a max-margin framework, and demonstrate empirically that this algorithm can produce more accurate clustering results than baseline methods based on tags or visual features, or both. Further, qualitative results verify that the clustering results can discover sub-categories and more specific instances of a given video category.
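
The pipeline outlined in the abstract (refine noisy tags using visual consistency, then cluster on refined tags together with visual features) can be illustrated with a toy sketch. This is not the authors' structured max-margin model: a k-nearest-neighbour majority vote stands in for the tag-correction step and plain k-means stands in for the learned clustering. All function names, parameters, and data below are hypothetical and chosen only for illustration.

```python
# Toy sketch only: k-NN tag voting plus k-means are stand-ins for the
# paper's structured max-margin model, which is NOT reproduced here.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def refine_tags(features, tags, k=2):
    """Replace each binary tag with a majority vote over the video itself
    and its k visually nearest neighbours (assumed consistency check)."""
    refined = []
    for i, f in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: sq_dist(f, features[j]))
        voters = [i] + others[:k]
        row = []
        for t in range(len(tags[i])):
            votes = sum(tags[j][t] for j in voters)
            row.append(1 if 2 * votes > len(voters) else 0)
        refined.append(row)
    return refined

def kmeans(points, init_centers, iters=10):
    """Plain k-means with caller-supplied initial centers."""
    centers = [list(c) for c in init_centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_videos(features, tags, n_clusters=2, k=2):
    """Refine tags, then cluster on [visual features + refined tags]."""
    refined = refine_tags(features, tags, k)
    joint = [list(f) + [float(t) for t in r] for f, r in zip(features, refined)]
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [joint[0]]
    while len(centers) < n_clusters:
        centers.append(max(joint, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return refined, kmeans(joint, centers)

# Hypothetical data: two visual groups; videos 2 and 5 carry wrong tags.
features = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
            [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
tags = [[1, 0], [1, 0], [0, 1],
        [0, 1], [0, 1], [1, 0]]
refined, labels = cluster_videos(features, tags)
print(refined)
print(labels)
```

In this toy data the neighbour vote overrides the two inconsistent tags before clustering. The paper instead infers tags and cluster assignments jointly inside one learned model, which this two-stage sketch does not capture.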


Pages: 526-539

Page count: 14
