Adaptive infant cry classification using multi-armed bandit modality selection in an attentive convolutional recurrent neural network model

Cited by: 0
Authors
Owino, Geofrey [1]
Kamanu, Timothy [1]
Ndiritu, John [1]
Kikechi, Conlet Biketi [2]
Affiliations
[1] Univ Nairobi, Dept Math, Nairobi, Kenya
[2] Pwani Univ, Dept Math & Comp Sci, Kilifi, Kenya
Keywords
Infant cry classification; Multi-armed bandit (MAB); Convolutional Recurrent Neural Network (CRNN); Attention mechanism; Feature extraction; Adaptive audio processing; FEATURES;
DOI
10.1007/s40747-025-02000-w
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate and timely detection of infant needs from their cries is crucial for effective intervention and improved well-being. However, existing infant cry classification methods often struggle with the inherent variability of cries, long-term dependencies within cry patterns, and a lack of adaptability to background noise and individual differences. This paper introduces a novel "Adaptive Infant Cry Classification" model that addresses these limitations by dynamically selecting the most informative features from the acoustic, spectral, and temporal domains using a multi-armed bandit (MAB) approach. The adaptive feature selection strategy, integrated within an Attentive Convolutional Recurrent Neural Network architecture, enhances the model's ability to capture both temporal and spectral patterns in infant cries, leading to improved accuracy, precision, and robustness. Evaluated on infant cry recordings from the Baby Chilanto and Donate Cry databases, the model achieves state-of-the-art performance, demonstrating its potential for real-world applications, including early detection of infant distress, personalized infant care plans, and the development of new interventions. Experimental results show significant improvements in classification accuracy (97%) and robustness over conventional methods. Notably, the proposed framework surpasses standard baseline CNN-RNN-based classifiers by 5-7% across multiple cry types, reducing overall error rates from around 12% to just under 5%. Ablation studies reveal that the MAB-based feature selection contributes up to a 10% relative increase in accuracy compared to static methods, while the attention components provide an additional 5% improvement. Combined, these components lead to a 10% absolute gain in F1-score under high noise conditions. This indicates the model's suitability for clinical and home-based environments, with the aim of improving artificial parenting anytime and anywhere.
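The abstract does not state which bandit policy the authors use, so the following is only a minimal sketch of the general idea of bandit-driven modality selection. It assumes a UCB1 policy, treats each of the three feature domains named above as one arm, and replaces the real classifier with a simulated reward (1 for a correct prediction, 0 otherwise). The names `ARMS`, `UCB1`, and `simulate`, and all per-arm accuracies, are hypothetical and not taken from the paper.

```python
import math
import random

# Hypothetical arm labels: the three feature domains named in the abstract.
ARMS = ["acoustic", "spectral", "temporal"]


class UCB1:
    """UCB1 bandit (an assumed policy; the paper's actual policy is not given here)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of times each arm was chosen
        self.values = [0.0] * n_arms  # running mean reward of each arm

    def select(self):
        # Try every arm once before applying the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            self.values[a] + math.sqrt(2.0 * math.log(total) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_accuracy, rounds=2000, seed=0):
    """Run the bandit against simulated Bernoulli rewards.

    true_accuracy[i] is a made-up probability that a classifier fed with
    feature domain i predicts correctly; it stands in for a real model.
    """
    rng = random.Random(seed)
    bandit = UCB1(len(true_accuracy))
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if rng.random() < true_accuracy[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    # Illustrative accuracies only; not results from the paper.
    bandit = simulate([0.2, 0.9, 0.3])
    for name, count, value in zip(ARMS, bandit.counts, bandit.values):
        print(f"{name}: chosen {count} times, mean reward {value:.2f}")
```

In a real system the reward would come from validation feedback of the downstream classifier rather than from a random draw, and the selected domain would determine which features are passed to the network.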
Pages: 16