Mixtures of probability experts for audio retrieval and indexing

Cited by: 19
Author
Slaney, M [1]
Affiliation
[1] IBM Almaden Res Ctr, San Jose, CA 95120 USA
Source
IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS | 2002
DOI
10.1109/ICME.2002.1035789
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper describes a system that connects non-speech sounds and words using linked multi-dimensional vector spaces. An approach based on a mixture of experts learns the mapping from one space to the other. The paper describes how audio and semantic data are converted into their respective vector spaces. Two different mixture-of-probability-experts models are trained to learn the association between acoustic queries and the corresponding semantic explanation, and vice versa. Test results are presented based on commercial sound-effects CDs.
Pages: 345-348
Page count: 4