Effective Multimodality Fusion Framework for Cross-Media Topic Detection

Cited by: 32
Authors
Chu, Lingyang [1]
Zhang, Yanyan [2]
Li, Guorong [2]
Wang, Shuhui [1]
Zhang, Weigang [3]
Huang, Qingming [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100080, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-media; fusion; multimodality; topic detection; topic recovery (TR); We-Media; TRACKING; DISCOVERY;
DOI
10.1109/TCSVT.2014.2347551
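Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code
0808; 0809;
Abstract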
Due to the prevalence of We-Media, information is quickly published and received in various forms anywhere and anytime through the Internet. The rich cross-media information carried by multimodal data in multiple media has a wide audience, deeply reflects social realities, and has a much greater social impact than information from any single medium. Therefore, automatically detecting topics from cross media is of great benefit to organizations (e.g., advertising agencies and governments) that care about social opinion. However, cross-media topic detection is challenging in two respects: 1) the multimodal data from different media often have distinct characteristics and 2) topics are presented in an arbitrary manner among noisy web data. In this paper, we propose a multimodality fusion framework and a topic recovery (TR) approach to effectively detect topics from cross-media data. The multimodality fusion framework flexibly incorporates the heterogeneous multimodal data into a multimodality graph, which takes full advantage of the rich cross-media information to effectively detect topic candidates (T.C.). The TR approach improves the entirety and purity of detected topics by: 1) merging the T.C. that are highly relevant themes of the same real topic and 2) filtering out the less-relevant noise data in the merged T.C. Extensive experiments on both single-media and cross-media data sets demonstrate the flexibility and effectiveness of our method in detecting topics from cross media.
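The pipeline outlined in the abstract (fuse modalities into one graph, detect topic candidates, merge related candidates, filter noise) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract gives no formulas, so the weighted-average fusion, connected-component candidate detection, mean-link merging, all thresholds, all function names, and the five-item data set below are assumptions chosen only to make the four stages concrete.

```python
from itertools import combinations

def fuse_graphs(sims, weights):
    """Weighted average of per-modality similarity matrices (assumed fusion rule)."""
    n = len(sims[0])
    total = sum(weights)
    return [[sum(w * s[i][j] for s, w in zip(sims, weights)) / total
             for j in range(n)] for i in range(n)]

def detect_candidates(fused, edge_thr):
    """Connected components of the thresholded graph stand in for topic candidates."""
    n = len(fused)
    seen, candidates = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in range(n)
                         if v != u and v not in comp and fused[u][v] >= edge_thr)
        seen |= comp
        candidates.append(comp)
    return candidates

def mean_link(a, b, fused):
    """Average similarity between the members of two candidates."""
    return sum(fused[i][j] for i in a for j in b) / (len(a) * len(b))

def merge_candidates(candidates, fused, merge_thr):
    """Greedily merge candidate pairs whose mean cross-similarity reaches merge_thr."""
    merged = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for a, b in combinations(range(len(merged)), 2):
            if mean_link(merged[a], merged[b], fused) >= merge_thr:
                merged[a] |= merged[b]
                del merged[b]
                changed = True
                break  # indices shifted, so restart the pair scan
    return merged

def filter_noise(candidate, fused, keep_thr):
    """Keep members whose average similarity to the other members reaches keep_thr."""
    if len(candidate) < 2:
        return set(candidate)
    kept = set()
    for i in candidate:
        others = [fused[i][j] for j in candidate if j != i]
        if sum(others) / len(others) >= keep_thr:
            kept.add(i)
    return kept

# Hypothetical data: items 0-1 and 2-3 are two themes of one topic; item 4 is
# noise that looks like item 3 visually but shares no text with anything.
text_sim = [
    [1.0, 1.0, 0.6, 0.6, 0.0],
    [1.0, 1.0, 0.6, 0.6, 0.0],
    [0.6, 0.6, 1.0, 1.0, 0.0],
    [0.6, 0.6, 1.0, 1.0, 0.2],
    [0.0, 0.0, 0.0, 0.2, 1.0],
]
visual_sim = [
    [1.0, 0.8, 0.2, 0.2, 0.0],
    [0.8, 1.0, 0.2, 0.2, 0.0],
    [0.2, 0.2, 1.0, 0.8, 0.0],
    [0.2, 0.2, 0.8, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
]

fused = fuse_graphs([text_sim, visual_sim], weights=[0.5, 0.5])
candidates = detect_candidates(fused, edge_thr=0.5)
merged = merge_candidates(candidates, fused, merge_thr=0.25)
cleaned = [filter_noise(c, fused, keep_thr=0.3) for c in merged]
print(candidates, merged, cleaned)
```

In this toy run the two themes start as separate candidates, the noise item rides along with one of them, the merge step joins the themes into a single topic, and the filter step removes the noise item. The real method presumably uses a more principled graph construction and candidate detector than the stand-ins here.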
Pages: 556-569
Number of pages: 14