A new approach for classification of generic audio data

Cited by: 4

Authors:

Lin, RS [1]

Chen, LH [1]

Affiliation:

[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 30050, Taiwan

Source:

INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE | 2005 / Vol. 19 / No. 01

Keywords:

audio classification; spectrogram; Bayesian decision function; multivariable Gaussian distribution

DOI:

10.1142/S0218001405003958

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Existing audio retrieval systems fall into one of two categories: single-domain systems that accept data of only a single type (e.g., speech), and multiple-domain systems that offer content-based retrieval for multiple types of audio data. Because a single-domain system has limited applications, a multiple-domain system is more useful. However, different types of audio data have different properties, which makes a multiple-domain system harder to develop. Classifying the audio information in advance solves this problem. In this paper, we propose a real-time classification method that classifies audio signals into several basic audio types, such as pure speech, music, song, speech with music background, and speech with environmental noise background. To make the proposed method robust across a variety of audio sources, we use a Bayesian decision function for the multivariable Gaussian distribution instead of manually adjusting a threshold for each discriminator. The proposed approach can be applied to content-based audio/video retrieval. Experiments show the efficiency and effectiveness of the method, with an accuracy rate of more than 96% for general audio data classification.
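As an illustration of the decision rule named in the abstract, the following is a minimal sketch of a Bayesian decision function with multivariate Gaussian class-conditional densities. It is not the authors' implementation: the two-dimensional feature vectors, the synthetic data, and the function names (`fit_gaussian_classes`, `discriminant`, `classify`) are assumptions made for this example, and the paper's spectrogram-based features are not reproduced here. Each class is scored with the standard discriminant g_i(x) = -1/2 (x - mu_i)^T Sigma_i^{-1} (x - mu_i) - 1/2 ln|Sigma_i| + ln P(class_i), and the class with the highest score is chosen, so no per-discriminator threshold has to be tuned by hand.

```python
import numpy as np


def fit_gaussian_classes(features, labels):
    """Estimate mean, covariance, and prior for each class from labeled vectors."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    params = {}
    for cls in np.unique(labels):
        rows = features[labels == cls]
        params[cls.item()] = {
            "mean": rows.mean(axis=0),
            "cov": np.cov(rows, rowvar=False),  # rows are samples, columns are features
            "prior": len(rows) / len(features),
        }
    return params


def discriminant(x, mean, cov, prior):
    """Log-posterior of one class up to a constant shared by all classes."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)          # numerically stable ln|Sigma|
    maha = diff @ np.linalg.solve(cov, diff)    # squared Mahalanobis distance
    return -0.5 * maha - 0.5 * logdet + np.log(prior)


def classify(x, params):
    """Return the class label whose discriminant value is largest."""
    x = np.asarray(x, dtype=float)
    return max(params, key=lambda cls: discriminant(x, **params[cls]))


if __name__ == "__main__":
    # Synthetic stand-in features (hypothetical, not taken from the paper).
    rng = np.random.default_rng(0)
    speech = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
    music = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(200, 2))
    X = np.vstack([speech, music])
    y = ["speech"] * 200 + ["music"] * 200

    model = fit_gaussian_classes(X, y)
    print(classify([0.2, -0.1], model))
    print(classify([4.8, 5.3], model))
```

Because the class statistics are estimated from training data, adapting the classifier to a new audio source in this sketch means refitting the means and covariances rather than re-tuning thresholds.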

Pages: 63-78

Page count: 16
