Efficient implementation techniques of an SVM-based speech/music classifier in SMV

Cited by: 9

Authors:

Lim, Chungsoo [1]

Chang, Joon-Hyuk [2]

Affiliations:

[1] Korea Natl Univ Transportat, Choungju Si, Chungbuk, South Korea

[2] Hanyang Univ, Sch Elect Engn, Seoul 133791, South Korea

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2015 / Vol. 74 / Issue 15

Keywords:

Speech/music classification; Support vector machine; Selectable mode vocoder; Embedded system

DOI:

10.1007/s11042-014-1859-8

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

For real-time speech and audio encoders used in various multimedia applications, low-complexity encoding algorithms are required. Indeed, accurate classification of input signals is the key prerequisite for variable bit rate encoding, which has been introduced in order to effectively utilize limited communication bandwidth. This paper investigates implementation issues with a support vector machine (SVM)-based speech/music classifier in the selectable mode vocoder (SMV) framework, which is a standard codec adopted by the Third-Generation Partnership Project 2 (3GPP2). While a support vector machine is well known for its superior classification capability, it is accompanied by a high computational cost. In order to achieve a more realizable system, we propose two techniques for the SVM-based speech/music classifier, aimed at reducing the number of classification requests to the classifier. The first technique introduces a simpler classifier that processes some of the input frames instead of the SVM-based classifier, and the second technique skips a portion of input frames based on strong inter-frame correlation in speech and music frames. Our experimental results show that the proposed techniques can reduce the computational cost of the SVM-based classifier by 95.4 % with negligible performance degradation, making it plausible for integration into the SMV codec.
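The two request-reduction ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical cheap pre-classifier that returns a label only when it is confident, a hypothetical `hold` parameter that reuses the last decision for a fixed number of following frames, and a toy scalar feature per frame in place of real SMV features. The stand-in `toy_svm` function is a simple threshold, used only to count how often the expensive classifier would be invoked.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Label = str  # "speech" or "music"


def classify_stream(
    frames: Sequence[float],
    cheap: Callable[[float], Optional[Label]],
    svm: Callable[[float], Label],
    hold: int = 2,
) -> Tuple[List[Label], int]:
    """Label every frame while counting calls to the expensive classifier.

    cheap: hypothetical low-cost classifier; returns None when unsure.
    svm:   stand-in for the expensive SVM-based classifier.
    hold:  hypothetical number of following frames that reuse a decision,
           relying on strong inter-frame correlation.
    """
    labels: List[Label] = []
    svm_calls = 0
    remaining_hold = 0
    last: Optional[Label] = None
    for frame in frames:
        # Frame skipping: reuse the previous decision without classifying.
        if remaining_hold > 0 and last is not None:
            labels.append(last)
            remaining_hold -= 1
            continue
        # Simpler classifier first; fall back to the SVM only if unsure.
        decision = cheap(frame)
        if decision is None:
            decision = svm(frame)
            svm_calls += 1
        labels.append(decision)
        last = decision
        remaining_hold = hold
    return labels, svm_calls


def toy_cheap(x: float) -> Optional[Label]:
    # Assumed thresholds, chosen only for this illustration.
    if x < 0.3:
        return "speech"
    if x > 0.7:
        return "music"
    return None


def toy_svm(x: float) -> Label:
    # Placeholder for a trained SVM decision function.
    return "music" if x >= 0.5 else "speech"


frames = [0.1, 0.2, 0.6, 0.9, 0.8, 0.4, 0.45, 0.55, 0.2]
labels, svm_calls = classify_stream(frames, toy_cheap, toy_svm, hold=2)
_, svm_calls_no_skip = classify_stream(frames, toy_cheap, toy_svm, hold=0)
print(labels, svm_calls, svm_calls_no_skip)
```

In this toy run the expensive classifier is called less often when frames are skipped than when every frame is classified. The trade-off is that a held decision can mislabel frames near a speech/music boundary, which is why the abstract reports the accuracy impact alongside the cost reduction.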

References

Pages: 5375-5400

Page count: 26

18 entries in total

[11] Discriminative Weight Training for Support Vector Machine-Based Speech/Music Classification in 3GPP2 SMV Codec [J].

Kim, Sang-Kyun ;

Chang, Joon-Hyuk .

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2010, E93A (01) :316-319

[12] Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine [J].

Kim, Sang-Kyun ;

Chang, Joon-Hyuk .

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2009, E92A (02) :630-632

[13] A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation [J].

Lavner, Yizhar ;

Ruinskiy, Dima .

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2009,

[14] 7 KHZ AUDIO CODING WITHIN 64 KBIT/S [J].

MAITRE, X .

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 1988, 6 (02) :283-298

[15] Intended human object detection for automatically protecting privacy in mobile video surveillance [J].

Nakashima, Yuta ;

Babaguchi, Noboru ;

Fan, Jianping .

MULTIMEDIA SYSTEMS, 2012, 18 (02) :157-173

[16]

Nguyen D., 2005, P 22 INT C MACH LEAR, P617

[17] An overview of statistical learning theory [J].

Vapnik, VN .

IEEE TRANSACTIONS ON NEURAL NETWORKS, 1999, 10 (05) :988-999

[18] Design efficient support vector machine for fast classification [J].

Zhan, YQ ;

Shen, DG .

PATTERN RECOGNITION, 2005, 38 (01) :157-161
