Audio Scanning Network: Bridging Time and Frequency Domains for Audio Classification

Cited by: 0

Authors:

Chen, Liangwei [1]

Zhou, Xiren [2]

Chen, Huanhuan [2]

Affiliations:

[1] Univ Sci & Technol China, Sch Data Sci, Hefei, Peoples R China

[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Peoples R China

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10 | 2024

Funding:

National Key Research and Development Program of China;

Keywords:

CANONICAL CORRELATION-ANALYSIS; RECOGNITION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the rapid growth of audio data, there is a pressing need for automatic audio classification. As a type of time-series data, audio exhibits waveform fluctuations in both the time and frequency domains that evolve over time, with similar instances sharing consistent patterns. This study introduces the Audio Scanning Network (ASNet), designed to leverage this abundant information to achieve stable and effective audio classification. ASNet captures real-time changes in audio waveforms across both the time and frequency domains through reservoir computing, supported by Reservoir Kernel Canonical Correlation Analysis (RKCCA) to explore correlations between time-domain and frequency-domain waveform fluctuations. This approach enables ASNet to comprehensively capture the changes and inherent correlations within the audio waveform without the need for time-consuming iterative training. Instead of converting audio into spectrograms, ASNet directly uses audio feature sequences to uncover associations between time and frequency fluctuations. Experiments on environmental sound and music genre classification tasks demonstrate that ASNet achieves performance comparable to state-of-the-art methods.
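
The abstract's point about avoiding iterative training can be illustrated with a generic reservoir-computing sketch. This is not ASNet and does not implement RKCCA. It is a minimal echo state network under assumed settings, where the function names (`reservoir_states`, `ridge_readout`), the reservoir size, the spectral radius, and the toy sine input are all hypothetical choices made for illustration. The recurrent weights are random and fixed, and only a linear readout is fitted, in closed form.

```python
import numpy as np

def reservoir_states(u, n_res=50, spectral_radius=0.9, seed=0):
    """Drive a fixed random reservoir with a sequence u of shape (T, d).

    Returns the reservoir states, shape (T, n_res). All sizes and scales
    here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, u.shape[1]))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    x = np.zeros(n_res)
    states = np.empty((u.shape[0], n_res))
    for t in range(u.shape[0]):
        # The state mixes the current input with the previous state.
        x = np.tanh(w_in @ u[t] + w @ x)
        states[t] = x
    return states

def ridge_readout(states, targets, alpha=1e-2):
    """Fit the linear readout by closed-form ridge regression (no gradient steps)."""
    gram = states.T @ states + alpha * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

# Toy usage: predict the next sample of a sine wave from reservoir states.
signal = np.sin(np.linspace(0, 20, 200)).reshape(-1, 1)
states = reservoir_states(signal[:-1])
w_out = ridge_readout(states, signal[1:])
prediction = states @ w_out
```

Because the reservoir weights are never updated, the only fitting step is a single linear solve. That is the general sense in which reservoir methods sidestep iterative training.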


Pages: 11355+

Page count: 10

