Investigating the Effective Dynamic Information of Spectral Shapes for Audio Classification

Times cited: 0
Authors
Chen, Liangwei [1]
Zhou, Xiren [2]
Chen, Qiuju [3]
Xiong, Fang [4]
Chen, Huanhuan [2]
Affiliations
[1] Univ Sci & Technol China, Sch Artificial Intelligence & Data Sci, Hefei 230027, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Peoples R China
[3] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Peoples R China
[4] Cent South Univ, Xiangya Hosp, Natl Clin Res Ctr Geriatr Dis, Dept Otolaryngol Head & Neck Surg, Changsha 410078, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Mel frequency cepstral coefficient; Data models; Spectral shape; Computational modeling; Feature extraction; Fitting; Music; Classification algorithms; Training; Multiple signal classification; Learning in the model space; dynamic information of the spectral shape; audio classification; mel-frequency cepstral coefficients; echo state network; MUSICAL GENRE CLASSIFICATION; FAULT-DIAGNOSIS; MODEL SPACE; RECOGNITION; ALGORITHM;
DOI
10.1109/TMM.2024.3521837
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The spectral shape holds crucial information for Audio Classification (AC), encompassing the spectrum's envelope, details, and dynamic changes over time. Conventional methods utilize cepstral coefficients for spectral shape description but overlook its variation details. Deep-learning approaches capture some dynamics but demand substantial training or fine-tuning resources. The Learning in the Model Space (LMS) framework precisely captures the dynamic information of temporal data by utilizing model fitting, even when computational resources and data are limited. However, applying LMS to audio faces challenges: 1) The high sampling rate of audio hinders efficient data fitting and capturing of dynamic information. 2) The Dynamic Information of Partial Spectral Shapes (DIPSS) may enhance classification, as only specific spectral shapes are relevant for AC. This paper extends an AC framework called Effective Dynamic Information Capture (EDIC) to tackle the above issues. EDIC constructs Mel-Frequency Cepstral Coefficients (MFCC) sequences within different dimensional intervals as the fitted data, which not only reduces the number of sequence sampling points but can also describe the change of the spectral shape in different parts over time. EDIC enables us to implement a topology-based selection algorithm in the model space, selecting effective DIPSS for the current AC task. The performance on three tasks confirms the effectiveness of EDIC.
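The interval-wise fitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed random reservoir, the next-frame prediction target, and the ridge penalty are all assumptions made for illustration, and the topology-based selection step is not shown. The sketch splits an MFCC matrix into contiguous coefficient intervals, fits a small echo state network readout to each interval's sequence, and uses the flattened readout weights as that interval's point in the model space.

```python
import numpy as np

def split_intervals(mfcc, width):
    """Split a (frames, coeffs) MFCC matrix into contiguous coefficient intervals.

    Hypothetical helper: the interval layout used in the paper is not given here.
    """
    n_coeffs = mfcc.shape[1]
    return [mfcc[:, start:start + width] for start in range(0, n_coeffs, width)]

def make_reservoir(n_inputs, n_units=20, spectral_radius=0.9, seed=0):
    """Build fixed random input and recurrent weights (assumed hyperparameters)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def fit_readout(seq, w_in, w, ridge=1e-2):
    """Fit a ridge-regression readout that predicts the next frame.

    Returns the flattened readout weights, which serve as the
    model-space representation of the sequence in this sketch.
    """
    n_units = w.shape[0]
    state = np.zeros(n_units)
    states = []
    for frame in seq[:-1]:
        state = np.tanh(w_in @ frame + w @ state)
        states.append(state)
    x = np.asarray(states)   # (frames - 1, n_units)
    y = seq[1:]              # (frames - 1, width)
    a = x.T @ x + ridge * np.eye(n_units)
    w_out = np.linalg.solve(a, x.T @ y)   # (n_units, width)
    return w_out.ravel()

# Synthetic stand-in for a real MFCC matrix: 100 frames, 12 coefficients.
mfcc = np.random.default_rng(1).standard_normal((100, 12))
parts = split_intervals(mfcc, width=4)
w_in, w = make_reservoir(n_inputs=4)
models = [fit_readout(part, w_in, w) for part in parts]
```

Each entry of models is one fixed-length vector per coefficient interval, so a selection or classification step could operate on these vectors rather than on the raw audio samples.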
Pages: 1114-1126
Number of pages: 13