M-MUSICS: an intelligent mobile music retrieval system

Cited by: 2
Authors
Rho, Seungmin [2]
Hwang, Eenjun [2]
Park, Jong Hyuk [1]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Dept Comp Sci & Engn, Seoul, South Korea
[2] Korea Univ, Sch Elect Engn, Seoul, South Korea
Keywords
Content-based audio retrieval; Mobile platform; Relevance feedback; Signal processing; Image retrieval; Similarity
DOI
10.1007/s00530-010-0212-y
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate voice humming transcription and efficient indexing and retrieval schemes are essential to a large-scale humming-based audio retrieval system. Although much research has been done to develop such schemes, their performance in terms of precision, recall, and F-measure remains unsatisfactory across all similarity metrics. In this paper, we propose a new voice query transcription scheme that combines note onset detection using dynamic threshold methods, fundamental frequency (F0) acquisition for each frame, and frequency realignment using K-means. For indexing, we use a popularity-adaptive structure called the frequently accessed index (FAI), which is based on frequently queried tunes. In addition, we propose a semi-supervised relevance feedback and query reformulation scheme based on a genetic algorithm to improve retrieval efficiency. We also extend this work to mobile multimedia environments and develop a mobile audio retrieval system. Experiments show that our system performs satisfactorily in wireless mobile multimedia environments.
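The abstract evaluates retrieval quality by precision, recall, and F-measure. As a reminder of how these three numbers relate, here is a minimal sketch that assumes the balanced F1 variant and set-based relevance judgments. The function name `precision_recall_f1` and the tune identifiers are hypothetical and do not come from the paper.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: tune ids returned by the system (hypothetical ids)
    relevant:  tune ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)

    # Guard against empty result or empty ground-truth sets.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: 4 tunes returned, 3 relevant, 2 in common.
p, r, f = precision_recall_f1(["t1", "t2", "t3", "t4"], ["t2", "t4", "t7"])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

The paper may use a weighted F-measure or rank-aware variants; this sketch shows only the unweighted, set-based case.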
Pages: 313-326
Page count: 14