Enhanced Multimedia Content Access and Exploitation using Semantic Speech Retrieval

Cited by: 4
Authors
Ordelman, Roeland [1]
de Jong, Franciska [2]
Larson, Martha [3]
Affiliations
[1] Netherlands Inst Sound & Vis, Hilversum, Netherlands
[2] Univ Twente, Human Media Interact, POB 217, NL-7500 AE Enschede, Netherlands
[3] Delft Univ Technol, Dept Mediamat, NL-2600 AA Delft, Netherlands
Source
2009 IEEE THIRD INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2009) | 2009
Keywords
speech retrieval; spoken content; speech recognition; multimedia retrieval and access; semantics
DOI
10.1109/ICSC.2009.80
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Techniques for automatic annotation of spoken content making use of speech recognition technology have long been characterized as holding unrealized promise to provide access to archives inundated with undisclosed multimedia material. This paper provides an overview of techniques and trends in semantic speech retrieval, which is taken to encompass all approaches offering meaning-based access to spoken word collections. We present descriptions, examples and insights for current techniques, including facing real-world heterogeneity, aligning parallel resources and exploiting collateral collections. We also discuss ways in which speech recognition technology can be used to create multimedia connections that make new modes of access available to users. We conclude with an overview of the challenges for semantic speech retrieval in the workflow of a real-world archive and perspectives on future tasks in which speech retrieval integrates information related to affect and appeal, dimensions that transcend topic.
Pages: 521 / +
Page count: 2