Adaptive infant cry classification using multi-armed bandit modality selection in an attentive convolutional recurrent neural network model

Cited by: 0

Authors:

Owino, Geofrey [1]

Kamanu, Timothy [1]

Ndiritu, John [1]

Kikechi, Conlet Biketi [2]

Affiliations:

[1] Univ Nairobi, Dept Math, Nairobi, Kenya

[2] Pwani Univ, Dept Math & Comp Sci, Kilifi, Kenya

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2025 / Vol. 11 / Issue 09

Keywords:

Infant cry classification; Multi-armed bandit (MAB); Convolutional Recurrent Neural Network (CRNN); Attention mechanism; Feature extraction; Adaptive audio processing; FEATURES;

DOI:

10.1007/s40747-025-02000-w

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurate and timely detection of infant needs from their cries is crucial for effective intervention and improved well-being. However, existing infant cry classification methods often struggle with the inherent variability of cries and with long-term dependencies within cry patterns, and they adapt poorly to background noise and individual differences. This paper introduces a novel "Adaptive Infant Cry Classification" model that addresses these limitations by dynamically selecting the most informative features from acoustic, spectral, and temporal domains using a multi-armed bandit (MAB) approach. The adaptive feature selection strategy, integrated within an Attentive Convolutional Recurrent Neural Network architecture, enhances the ability of the model to capture both temporal and spectral patterns in infant cries, leading to improved accuracy, precision, and robustness. Evaluated on a comprehensive dataset of infant cry recordings from the Baby Chilanto and Donate Cry databases, the model achieves state-of-the-art performance, demonstrating its potential for real-world applications, including early detection of infant distress, personalized infant care plans, and the development of new interventions. Experimental results demonstrate significant improvements in classification accuracy (97%) and robustness over conventional methods. Notably, the proposed framework surpasses standard baseline CNN-RNN-based classifiers by 5-7% across multiple cry types, reducing overall error rates from around 12% to just under 5%. Ablation studies reveal that the MAB-based feature selection contributes up to a 10% relative increase in accuracy compared to static methods, while the attention components provide an additional 5% improvement. Combined, these features lead to a 10% absolute gain in F1-score under high noise conditions. These results indicate the model's suitability for clinical and home-based environments, with the aim of improving artificial parenting anytime and anywhere.
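The abstract does not say which bandit algorithm drives the feature selection, so the following is only a minimal sketch of the general idea, assuming a UCB1 policy over three arms named after the feature domains it mentions. The per-domain hit rates are invented for the simulation and the reward is a simulated "classification was correct" signal; none of this is taken from the paper.

```python
import math
import random


class UCB1Selector:
    """UCB1 bandit over a fixed set of arms (here: feature domains)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}   # pulls per arm
        self.means = {a: 0.0 for a in self.arms}  # running mean reward per arm

    def select(self):
        # Play every arm once before applying the UCB rule.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        total = sum(self.counts.values())
        # Mean reward plus an exploration bonus that shrinks with more pulls.
        return max(
            self.arms,
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(total) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(steps=2000, seed=0):
    """Run the selector against simulated rewards (hypothetical hit rates)."""
    rng = random.Random(seed)
    # Assumed probability that a classifier fed this domain is correct.
    hit_rate = {"acoustic": 0.70, "spectral": 0.90, "temporal": 0.60}
    selector = UCB1Selector(hit_rate)
    for _ in range(steps):
        arm = selector.select()
        reward = 1.0 if rng.random() < hit_rate[arm] else 0.0
        selector.update(arm, reward)
    return selector


if __name__ == "__main__":
    result = simulate()
    for name in result.arms:
        print(name, result.counts[name], round(result.means[name], 3))
```

In a real pipeline the reward would come from the downstream classifier (for example, validation accuracy on the current batch) rather than from a fixed probability, and the arms could be combinations of domains rather than single ones.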


Pages: 16
