Environmental sound recognition on embedded devices using deep learning: a review

Cited by: 1
Authors
Gairi, Pau [1]
Palleja, Tomas [1]
Tresanchez, Marcel [1]
Affiliations
[1] Univ Lleida, Res Grp Log Optimizat & Robot, Dep Ind Engn & Bldg, Jaume II,69, Lleida 25001, Spain
Keywords
Sound recognition; Audio classification; Deep learning techniques; Edge device; Real-time sensing; Resource-constrained devices; Event detection; System; Classification; Internet; Signal
DOI
10.1007/s10462-025-11106-z
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sound recognition has a wide range of applications beyond speech and music, including environmental monitoring, sound source classification, mechanical fault diagnosis, audio fingerprinting, and event detection. These applications often require real-time data processing, making them well-suited for embedded systems. However, embedded devices face significant challenges due to limited computational power and memory and the need for low power consumption. Despite these constraints, achieving high performance in environmental sound recognition typically requires complex algorithms. Deep Learning models have demonstrated high accuracy on existing datasets, making them a popular choice for such tasks. However, these models are resource-intensive, posing challenges for real-time edge applications. This paper presents a comprehensive review of integrating Deep Learning models into embedded systems, examining their state-of-the-art applications, key components, and the steps involved. It also explores strategies to optimise performance in resource-constrained environments through a comparison of various implementation approaches such as knowledge distillation, pruning, and quantization, with studies achieving a reduction in complexity of up to 97% compared to the unoptimized model. Overall, we conclude that in spite of the availability of lightweight deep learning models, input features, and compression techniques, their integration into low-resource devices, such as microcontrollers, remains limited. Furthermore, more complex tasks, such as general sound classification, especially with expanded frequency bands and real-time operation, have yet to be effectively implemented on these devices. These findings highlight the need for a standardised research framework to evaluate these technologies applied to resource-constrained devices, and for further development to realise the wide range of potential applications.
Pages: 35