Crowdsourced Time-sync Video Tagging using Temporal and Personalized Topic Modeling

Cited by: 45
Authors
Wu, Bin [1]
Zhong, Erheng [1]
Tan, Ben [1]
Horner, Andrew [1]
Yang, Qiang [1, 2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Huawei, Noahs Ark Lab, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14) | 2014
Keywords
Video tagging; crowdsourcing; topic modeling; temporal and personalized model
DOI
10.1145/2623330.2623625
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Time-sync video tagging aims to automatically generate tags for each video shot. It can improve the user's experience in previewing a video's timeline structure compared to traditional schemes that tag an entire video clip. In this paper, we propose a new application which extracts time-sync video tags by automatically exploiting crowdsourced comments from video websites such as Nico Nico Douga, where videos are commented on by online crowd users in a time-sync manner. The challenge of the proposed application is that users with bias interact with one another frequently and bring noise into the data, while the comments are too sparse to compensate for the noise. Previous techniques are unable to handle this task well as they consider video semantics independently, which may overfit the sparse comments in each shot and thus fail to provide accurate modeling. To resolve these issues, we propose a novel temporal and personalized topic model that jointly considers temporal dependencies between video semantics, users' interaction in commenting, and users' preferences as prior knowledge. Our proposed model shares knowledge across video shots via users to enrich the short comments, and peels off user interaction and user bias to solve the noisy-comment problem. Log-likelihood analyses and user studies on large datasets show that the proposed model outperforms several state-of-the-art baselines in video tagging quality. Case studies also demonstrate our model's capability of extracting tags from the crowdsourced short and noisy comments.
Pages: 721-730
Number of pages: 10