A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification

Cited: 13
Authors
Copiaco, Abigail [1 ]
Ritz, Christian [2 ]
Abdulaziz, Nidhal [1 ]
Fasciani, Stefano [3 ]
Affiliations
[1] Univ Wollongong Dubai, Fac Engn & Informat Sci, Dubai 20183, U Arab Emirates
[2] Univ Wollongong, Sch Elect Comp & Telecommun Engn, Northfields Ave, Wollongong, NSW 2522, Australia
[3] Univ Oslo, Dept Musicol, Sem Saelands Vei 2, N-0371 Oslo, Norway
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 11
Keywords
neural network; transfer learning; scalograms; MFCC; log-mel; pre-trained models; acoustic event
DOI
10.3390/app11114880
Chinese Library Classification
O6 [Chemistry]
Subject Classification Code
0703
Abstract
Featured Application: The algorithms explored in this research can be used for any multi-level classification application.

Recent methodologies for audio classification frequently involve cepstral and spectral features applied to single-channel recordings of acoustic scenes and events. In addition, transfer learning has been widely used over the years and has proven to be an efficient alternative to training neural networks from scratch: the lower time and resource requirements of pre-trained models allow for more versatility in developing classification systems. However, information on classification performance when different features are used for multi-channel recordings is often limited. Furthermore, pre-trained networks are initially trained on large databases and are often unnecessarily large. This poses a challenge when developing systems for devices with limited computational resources, such as mobile or embedded devices. This paper presents a detailed study of the most prominent and widely used cepstral and spectral features for multi-channel audio applications, and accordingly proposes the use of spectro-temporal features. Additionally, the paper details the development of a compact version of the AlexNet model for computationally limited platforms, based on studies of performance against various architectural and parameter modifications of the original network. The aim is to minimize the network size while maintaining the series network architecture and preserving the classification accuracy. Considering that other state-of-the-art compact networks use complex directed acyclic graphs, a series architecture offers an advantage in customizability. Experimentation was carried out in MATLAB, using a database generated for this task that is composed of four-channel synthetic recordings of both sound events and scenes. The top-performing methodology achieved a weighted F1-score of 87.92% for scalogram features classified via the modified AlexNet-33 network, which has a size of 14.33 MB; the original AlexNet returned 86.24% at a size of 222.71 MB.
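To make the feature comparison concrete, below is a minimal sketch of how the three feature families the abstract contrasts (cepstral MFCCs, spectral log-mel, and spectro-temporal scalograms) can be extracted per channel from a multi-channel recording. The paper's experiments were run in MATLAB; this Python version using librosa and PyWavelets is an illustrative stand-in, not the authors' code, and all parameter values (`n_mfcc`, `n_mels`, the Morlet wavelet, the scale range) are assumptions.

```python
# Hypothetical sketch: per-channel MFCC, log-mel, and scalogram extraction
# for a multi-channel recording. Parameters are illustrative assumptions.
import numpy as np
import librosa
import pywt

def extract_features(wav_path, n_mfcc=40, n_mels=64):
    # Load all channels; with mono=False librosa returns (channels, samples).
    y, sr = librosa.load(wav_path, sr=None, mono=False)
    y = np.atleast_2d(y)  # ensure a 2-D array even for mono files

    feats = []
    for ch in y:
        # Cepstral features: Mel-frequency cepstral coefficients.
        mfcc = librosa.feature.mfcc(y=ch, sr=sr, n_mfcc=n_mfcc)
        # Spectral features: log-mel spectrogram.
        mel = librosa.feature.melspectrogram(y=ch, sr=sr, n_mels=n_mels)
        log_mel = librosa.power_to_db(mel)
        # Spectro-temporal features: scalogram via a continuous wavelet
        # transform (Morlet wavelet here, as an assumed choice).
        coeffs, _ = pywt.cwt(ch, np.arange(1, 128), "morl")
        scalogram = np.abs(coeffs)
        feats.append({"mfcc": mfcc, "log_mel": log_mel, "scalogram": scalogram})
    return sr, feats
```

Each feature matrix can then be rendered as an image and fed to an image-classification network, which is what makes the transfer-learning approach described above applicable.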
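The transfer-learning setup can likewise be sketched. The snippet below shows the standard starting point in PyTorch: load a pre-trained AlexNet, freeze its convolutional feature extractor, and replace the final classifier layer with one sized for the target classes. This is not the paper's compact AlexNet-33, which modifies the architecture itself; the class count and input handling here are illustrative assumptions.

```python
# Hypothetical sketch: fine-tuning a pre-trained AlexNet for a new
# audio-derived image classification task. Not the paper's AlexNet-33.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # assumption: set to the dataset's actual class count

# Load ImageNet-pretrained AlexNet (torchvision >= 0.13 weights API).
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)

# Freeze the convolutional feature extractor so only the new head trains,
# which is the usual low-resource transfer-learning recipe.
for p in model.features.parameters():
    p.requires_grad = False

# AlexNet's final classifier layer is classifier[6]: Linear(4096, 1000);
# swap it for a layer matching the target class count.
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)

# Feature images (e.g., scalograms) are resized to AlexNet's 224x224 RGB input.
dummy = torch.randn(1, 3, 224, 224)
logits = model(dummy)  # shape: (1, NUM_CLASSES)
```

Freezing the feature extractor reflects the abstract's point about lower time and resource requirements; shrinking the network itself, as AlexNet-33 does, goes further by reducing the stored model size.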
Pages: 23