Multimodal Information Fusion for Semantic Video Analysis

Cited by: 1
Authors
Gulen, Elvan [1]
Yilmaz, Turgay [1, 2]
Yazici, Adnan [1]
Affiliations
[1] Middle East Tech Univ, Dept Comp Engn, Ankara, Turkey
[2] Univ Tokyo, Inst Ind Sci, Tokyo, Japan
Keywords
Concept Interactions; Multimedia Content Analysis; Multimedia Information; Multimodal Fusion; Semantic Concept Detection
DOI
10.4018/jmdem.2012100103
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Multimedia data is inherently multimodal, and a successful analysis of multimedia content should use all available modalities. In addition, because concepts carry valuable cues about other concepts, concept interaction is an important source of multimedia information that helps to improve fusion performance. This study aims to show that integrating the existing modalities together with concept interactions yields better performance in detecting semantic concepts. To that end, the authors present a multimodal fusion approach that integrates semantic information obtained from various modalities along with additional semantic cues. Experiments conducted on the TRECVID 2007 and CCV Database datasets validate the superiority of this combination over the best single modality and over alternative modality combinations. The results show that the proposed fusion approach provides a 16.7% relative performance gain on the TRECVID dataset and a 47.7% relative performance improvement on the CCV database over the results of the best unimodal approaches.
Pages: 52-74
Page count: 23