Generalized concept overlay for semantic multi-modal analysis of audio-visual content

Cited by: 0
Authors
Mezaris, Vasileios [1]
Gidaros, Spyros [1]
Kompatsiaris, Ioannis [1]
Affiliations
[1] Ctr Res & Technol Hellas, Informat & Telemat Inst, Thermi 57001, Greece
Keywords
Video analysis; Semantic multi-modal analysis
DOI
10.1109/SMAP.2009.13
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
This work addresses the problem of performing multimodal analysis of audio-visual streams by effectively combining the results of multiple uni-modal analysis techniques. To this end, a non-learning-based approach is proposed that takes into account the potential variability of the different uni-modal analysis techniques in terms of the decomposition of the audio-visual stream that they adopt, the concepts of an ontology that they consider, the varying semantic importance of each modality, and other factors. Preliminary results from applying the proposed approach to broadcast news content indicate its effectiveness.
Pages: 27-32
Page count: 6
Related papers
  • [1] Multi-modal audio-visual event recognition for football analysis
    Barnard, M
    Odobez, JM
    Bengio, S
    2003 IEEE XIII WORKSHOP ON NEURAL NETWORKS FOR SIGNAL PROCESSING - NNSP'03, 2003, : 469 - 478
  • [2] Multi-modal authentication system based on audio-visual data
    Debnath, Saswati
    Roy, Pinki
    PROCEEDINGS OF THE 2019 IEEE REGION 10 CONFERENCE (TENCON 2019): TECHNOLOGY, KNOWLEDGE, AND SOCIETY, 2019, : 2507 - 2512
  • [3] Multi-Modal Multi-Correlation Learning for Audio-Visual Speech Separation
    Wang, Xiaoyu
    Kong, Xiangyu
    Peng, Xiulian
    Lu, Yan
    INTERSPEECH 2022, 2022, : 886 - 890
  • [4] USING COMPRESSED AUDIO-VISUAL WORDS FOR MULTI-MODAL SCENE CLASSIFICATION
    Kurcius, Jan J.
    Breckon, Toby P.
    2014 INTERNATIONAL WORKSHOP ON COMPUTATIONAL INTELLIGENCE FOR MULTIMEDIA UNDERSTANDING (IWCIM), 2014,
  • [5] Audio-visual flow - A variational approach to multi-modal flow estimation
    Hamid, R
    Bobick, A
    Yezzi, A
    ICIP: 2004 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1- 5, 2004, : 2563 - 2566
  • [6] Audio-Visual Emotion Recognition System Using Multi-Modal Features
    Handa, Anand
    Agarwal, Rashi
    Kohli, Narendra
    INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2021, 15 (04)
  • [7] Audio-Visual Scene Classification Based on Multi-modal Graph Fusion
    Lei, Han
    Chen, Ning
    INTERSPEECH 2022, 2022, : 4157 - 4161
  • [8] Audio-visual Speaker Recognition via Multi-modal Correlated Neural Networks
    Geng, Jiajia
    Liu, Xin
    Cheung, Yiu-ming
    2016 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE WORKSHOPS (WIW 2016), 2016, : 123 - 128
  • [9] A System for the Semantic Multimodal Analysis of News Audio-Visual Content
    Mezaris, Vasileios
    Gidaros, Spyros
    Papadopoulos, Georgios Th.
    Kasper, Walter
    Steffen, Joerg
    Ordelman, Roeland
    Huijbregts, Marijn
    de Jong, Franciska
    Kompatsiaris, Ioannis
    Strintzis, Michael G.
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2010,