Evaluating Noise-Robustness of Convolutional and Recurrent Neural Networks for Baby Cry Recognition

Cited by: 0
Authors
Renanti, Medhanita Dewi [1,2]
Buono, Agus [3 ]
Priandana, Karlisa [3 ]
Wijaya, Sony Hartono [3]
Affiliations
[1] IPB Univ, Doctoral Study Program Comp Dept, Bogor, Indonesia
[2] IPB Univ, Coll Vocat Studies, Software Engn Technol, Bogor, Indonesia
[3] IPB Univ, Dept Comp, Bogor, Indonesia
Keywords
Baby cry recognition; deep learning; gated recurrent unit; long short-term memory; noise robustness; signal-to-noise ratio
DOI
10.14569/IJACSA.2024.0150660
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Reliable baby cry recognition plays a crucial role in infant care and monitoring, yet real-world environments challenge system accuracy because of background noise. This study proposes a novel CNN architecture for baby cry recognition under varying noise conditions, featuring three convolutional layers, a max pooling layer, and a dropout rate of 0.5, and compares its performance against standard RNN models. The models were trained for 100 epochs with a batch size of 64 and evaluated in both clean and noisy environments. To simulate real-world scenarios, recordings were transformed into audio signals and mixed with background noise at different signal-to-noise ratios (SNRs). Results indicate that both models achieved high accuracy (>89%) in noise-free conditions. However, the proposed CNN maintained higher precision (93%) and overall accuracy (91%) than the RNN under 10 dB noise, demonstrating its superior noise robustness for baby cry recognition. This improvement is attributed to the CNN's capacity to capture spatial features in audio signals, making it less susceptible to noise disruptions. These findings contribute to the development of more reliable and robust baby cry recognition systems.
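The noise-mixing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which noise type or mixing procedure the authors used, so everything here is an assumption for illustration: the function names `power`, `mix_at_snr`, and `measured_snr_db` are hypothetical, a 440 Hz tone stands in for a cry recording, and Gaussian white noise stands in for background noise. The sketch scales the noise so that the ratio of signal power to noise power matches a target SNR in dB, using SNR_dB = 10 * log10(P_signal / P_noise).

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    # Noise power required for the requested SNR.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Placeholder data (assumption): a one-second 440 Hz tone instead of a real
# cry recording, and seeded Gaussian white noise instead of recorded noise.
rng = random.Random(0)
sample_rate = 8000
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy_10db = mix_at_snr(clean, noise, snr_db=10)
noisy_0db = mix_at_snr(clean, noise, snr_db=0)
```

Calling `measured_snr_db(clean, noisy_10db)` recovers the requested SNR up to floating-point error, which is a quick sanity check before feeding mixed audio to any classifier.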
Pages: 585-593
Number of pages: 9