Efficient implementation techniques of an SVM-based speech/music classifier in SMV

Cited by: 9
Authors
Lim, Chungsoo [1 ]
Chang, Joon-Hyuk [2 ]
Affiliations
[1] Korea Natl Univ Transportat, Choungju Si, Chungbuk, South Korea
[2] Hanyang Univ, Sch Elect Engn, Seoul 133791, South Korea
Keywords
Speech/music classification; Support vector machine; Selectable mode vocoder; Embedded system
DOI
10.1007/s11042-014-1859-8
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
For real-time speech and audio encoders used in various multimedia applications, low-complexity encoding algorithms are required. Indeed, accurate classification of input signals is the key prerequisite for variable bit rate encoding, which has been introduced in order to effectively utilize limited communication bandwidth. This paper investigates implementation issues with a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV) framework, which is a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). While a support vector machine is well known for its superior classification capability, it is accompanied by a high computational cost. In order to achieve a more realizable system, we propose two techniques for the SVM-based speech/music classifier, aimed at reducing the number of classification requests to the classifier. The first technique introduces a simpler classifier that processes some of the input frames instead of the SVM-based classifier, and the second technique skips a portion of input frames based on strong inter-frame correlation in speech and music frames. Our experimental results show that the proposed techniques can reduce the computational cost of the SVM-based classifier by 95.4 % with negligible performance degradation, making it plausible for integration into the SMV codec.
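The two request-reduction ideas named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`classify_stream`, `simple_score`, `svm_classify`), the thresholds `low` and `high`, and the skip rule (reuse the previous label for `skip` frames after two consecutive decisions agree) are all assumptions made for the example; the abstract does not specify any of them.

```python
from typing import Callable, Iterable, List, Tuple

SPEECH, MUSIC = 0, 1


def classify_stream(
    frames: Iterable[float],
    svm_classify: Callable[[float], int],   # expensive classifier (hypothetical stand-in)
    simple_score: Callable[[float], float], # cheap score (hypothetical stand-in)
    low: float = -1.0,                      # assumed threshold: at or below -> speech
    high: float = 1.0,                      # assumed threshold: at or above -> music
    skip: int = 1,                          # assumed number of frames to skip
) -> Tuple[List[int], int]:
    """Label each frame and count how often the expensive classifier ran."""
    labels: List[int] = []
    svm_calls = 0
    prev_label = None
    to_skip = 0

    for frame in frames:
        # Idea 2 (frame skipping): while a skip budget remains, reuse the
        # previous decision, relying on inter-frame correlation.
        if to_skip > 0 and prev_label is not None:
            labels.append(prev_label)
            to_skip -= 1
            continue

        # Idea 1 (simpler classifier first): only ambiguous frames are
        # forwarded to the expensive classifier.
        score = simple_score(frame)
        if score <= low:
            label = SPEECH
        elif score >= high:
            label = MUSIC
        else:
            label = svm_classify(frame)
            svm_calls += 1

        # Assumed skip rule: two consecutive agreeing decisions grant a
        # skip budget; a change of label resets it.
        to_skip = skip if label == prev_label else 0
        labels.append(label)
        prev_label = label

    return labels, svm_calls
```

In this sketch each frame is reduced to a single number for brevity; a real front end would pass a feature vector, and the cheap score and thresholds would have to be tuned on data.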
Pages: 5375-5400
Page count: 26