Recognizing high-level audio-visual concepts using context

Cited by: 0
Authors
Naphade, MR [1]
Huang, TS [1]
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, Coordinated Sci Lab, Urbana, IL 61801 USA
Source
2001 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS | 2001
Keywords
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Recognition of high-level semantics from audio-visual data is a challenging multimedia understanding problem. The difficulty lies mainly in the gap between low-level media features and high-level semantic concepts. In an attempt to bridge this gap, we proposed a probabilistic framework for semantic understanding [6, 5]. The components of this framework are probabilistic multimedia objects and a graphical network of such objects. In this paper, we show how the framework supports detection of multiple high-level concepts, which enjoy spatial and temporal support. More importantly, we show why context matters and how it can be modeled. Using a factor graph framework, we model context and use it to improve detection of sites, objects, and events. Using the concepts Outdoor and flying-helicopter, we demonstrate how the factor graph multinet models context. Using ROC curves and probability-of-error curves, we support the intuition that context should help.
Pages: 46-49
Number of pages: 4