A new approach for classification of generic audio data

Cited by: 4

Authors:

Lin, RS [1]

Chen, LH [1]

Affiliation:

[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 30050, Taiwan

Source:

INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE | 2005 / Vol. 19 / No. 01

Keywords:

audio classification; spectrogram; Bayesian decision function; multivariable Gaussian distribution;

DOI:

10.1142/S0218001405003958

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Existing audio retrieval systems fall into one of two categories: single-domain systems that accept data of only a single type (e.g. speech), and multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system is more useful. However, different types of audio data have different properties, which makes a multiple-domain system harder to develop. If audio information can be classified in advance, this problem can be solved. In this paper, we propose a real-time classification method that classifies audio signals into several basic audio types such as pure speech, music, song, speech with music background, and speech with environmental noise background. To make the proposed method robust across a variety of audio sources, we use a Bayesian decision function for the multivariable Gaussian distribution instead of manually adjusting a threshold for each discriminator. The proposed approach can be applied to content-based audio/video retrieval. In the experiments, the efficiency and effectiveness of the method are shown by an accuracy rate of more than 96% for general audio data classification.
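
For readers unfamiliar with the term, a Bayesian decision function for Gaussian class-conditional densities is usually written as the quadratic discriminant sketched here. This is the standard textbook form, not a formula quoted from the paper. The feature vector $x$, the audio classes $\omega_i$, the class means $\mu_i$, the covariance matrices $\Sigma_i$, and the priors $P(\omega_i)$ are notation assumed for illustration, and this record does not state which features the authors extract from the spectrogram.

```latex
% Assumed model: p(x | \omega_i) is Gaussian with mean \mu_i and covariance \Sigma_i.
% The class-independent constant -(d/2) \ln 2\pi is dropped because it does not affect the argmax.
g_i(x) = -\tfrac{1}{2}\,(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)
         \;-\; \tfrac{1}{2}\ln\lvert\Sigma_i\rvert
         \;+\; \ln P(\omega_i),
\qquad
\hat{\omega}(x) = \arg\max_i \; g_i(x)
```

Under this form the parameters $\mu_i$ and $\Sigma_i$ are estimated from labeled training data, so the decision boundary follows from the data rather than from a hand-tuned threshold, which appears to be the contrast the abstract draws.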

Pages: 63-78

Page count: 16
