Distortion discriminant analysis for audio fingerprinting

Cited by: 73
Authors
Burges, CJC [1]
Platt, JC
Jana, S
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003 / Vol. 11 / No. 3
Keywords
audio fingerprinting; dimensional reduction; robust feature extraction
DOI
10.1109/TSA.2003.811538
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Mapping audio data to feature vectors for classification, retrieval, or identification tasks presents four principal challenges. The dimensionality of the input must be significantly reduced; the resulting features must be robust to likely distortions of the input; the features must be informative for the task at hand; and the feature extraction operation must be computationally efficient. In this paper, we propose Distortion Discriminant Analysis (DDA), which fulfills all four of these requirements. DDA constructs a linear, convolutional neural network out of layers, each of which performs an oriented PCA dimensional reduction. We demonstrate the effectiveness of DDA on two audio fingerprinting tasks: searching for 500 audio clips in 36 h of audio test data, and playing over 10 days of audio against a database with approximately 240,000 fingerprints. We show that the system is robust to kinds of noise that are not present in the training procedure. In the large test, the system gives a false positive rate of 1.5 × 10^-8 per audio clip, per fingerprint, at a false negative rate of 0.2% per clip.
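The abstract names oriented PCA (OPCA) as the building block of each DDA layer but does not spell it out. As a rough illustration only, the sketch below finds a single OPCA direction in two dimensions: the unit vector n that maximises the ratio of signal variance to noise variance, (n^T C n) / (n^T R n), by solving the generalised eigenproblem C n = q R n. The function names `second_moment` and `opca_direction`, the restriction to 2-D, and the closed-form 2x2 solver are assumptions made for this example; they are not taken from the paper, which stacks such projections into a multi-layer convolutional network.

```python
import math


def second_moment(points, center=True):
    """2x2 second-moment matrix of 2-D points (the covariance if center=True)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n if center else 0.0
    my = sum(p[1] for p in points) / n if center else 0.0
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]


def opca_direction(C, R):
    """Return (n, q): the unit vector n maximising (n^T C n) / (n^T R n), and that ratio q.

    C is the signal matrix and R the noise matrix (both symmetric 2x2, R positive
    definite). The generalised eigenproblem C n = q R n is solved via M = R^-1 C.
    """
    det_r = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    if det_r <= 0:
        raise ValueError("noise matrix R must be positive definite")
    # Explicit inverse of the 2x2 noise matrix.
    inv = [[R[1][1] / det_r, -R[0][1] / det_r],
           [-R[1][0] / det_r, R[0][0] / det_r]]
    M = [[sum(inv[i][k] * C[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Largest eigenvalue of M from its trace and determinant.
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    q = tr / 2 + math.sqrt(max(tr * tr / 4 - det, 0.0))
    # Each row of (M - q I) yields a candidate eigenvector; keep the better-conditioned one.
    a = (M[0][1], q - M[0][0])
    b = (q - M[1][1], M[1][0])
    v = a if math.hypot(*a) >= math.hypot(*b) else b
    norm = math.hypot(*v)
    if norm == 0.0:
        return (1.0, 0.0), q  # M is a multiple of the identity: any direction works
    return (v[0] / norm, v[1] / norm), q


# Toy matrices: the signal has more variance along y, but so does the noise.
signal = [[1.0, 0.0], [0.0, 4.0]]
noise = [[0.01, 0.0], [0.0, 16.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

pca_dir, _ = opca_direction(signal, identity)   # plain PCA: noise treated as isotropic
opca_dir, ratio = opca_direction(signal, noise)
print("PCA direction: ", pca_dir)
print("OPCA direction:", opca_dir, "signal-to-noise ratio:", ratio)
```

In this toy setup plain PCA follows the high-variance y-axis, while OPCA prefers the x-axis because the noise there is much smaller. That is the sense in which an OPCA projection can be both dimension-reducing and robust to the distortions it was trained against.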
Pages: 165-174
Page count: 10