A fully automated approach for baby cry signal segmentation and boundary detection of expiratory and inspiratory episodes

Cited by: 22
Authors
Abou-Abbas, Lina [1, 2]
Tadj, Chakib [1]
Fersaie, Hesam Alaie [1]
Affiliations
[1] Quebec Univ, Dept Elect Engn, Ecole Technol Super, 1100 Rue Notre Dame Ouest, Montreal, PQ H3C 1K3, Canada
[2] McGill Univ, Hlth Ctr, 5252 Blvd Maisonneuve, Montreal, PQ H4A 3S5, Canada
Keywords
INFANT CRY; CLASSIFICATION; RECOGNITION; IDENTIFICATION; TRACKING; SPECTRUM; VOICE;
DOI
10.1121/1.5001491
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The detection of cry sounds is generally an important pre-processing step for various applications involving cry analysis, such as diagnostic systems, electronic monitoring systems, emotion detection, and robotics for baby caregivers. Given its complexity, automatic cry segmentation is a challenging problem. In this paper, a framework for automatic cry sound segmentation for application in a cry-based diagnostic system is proposed. The contribution of various additional time- and frequency-domain features to increasing the robustness of a Gaussian mixture model/hidden Markov model (GMM/HMM)-based cry segmentation system in noisy environments is studied. A fully automated segmentation algorithm to extract cry sound components, namely, audible expiration and inspiration, is introduced. It is grounded on two approaches: statistical analysis based on GMM or HMM classifiers, and a post-processing method based on intensity, zero crossing rate, and fundamental frequency feature extraction. The main focus of this paper is to extend the systems developed in previous works to include a post-processing stage with a set of corrective and enhancing tools to improve the classification performance. This full approach makes it possible to determine precisely the start and end points of the expiratory and inspiratory components of a cry signal, EXP and INSV, respectively, in any given sound signal. Experimental results have indicated the effectiveness of the proposed solution. EXP and INSV detection rates of approximately 94.29% and 92.16%, respectively, were achieved by applying a tenfold cross-validation technique to avoid over-fitting. © 2017 Author(s).
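As a rough illustration of two of the post-processing features named in the abstract (intensity and zero crossing rate), the following is a minimal sketch of how they might be computed frame by frame. It is not the authors' implementation: the function names `frame_features` and `synth_tone`, the frame length, the hop size, and the 16 kHz sampling rate are all assumptions chosen for the example, and fundamental frequency estimation is left out.

```python
import math


def frame_features(samples, frame_len=400, hop=200):
    """Return one (energy, zcr) pair per frame.

    energy: mean squared amplitude of the frame (a simple intensity proxy).
    zcr: fraction of adjacent sample pairs whose signs differ.
    frame_len and hop are illustrative values, not the paper's settings.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        feats.append((energy, zcr))
    return feats


def synth_tone(freq_hz, n, sr=16000, amp=0.5):
    """Hypothetical helper: n samples of a sine tone, used only for the demo."""
    return [amp * math.sin(2 * math.pi * freq_hz * t / sr) for t in range(n)]


if __name__ == "__main__":
    low = frame_features(synth_tone(400, 1600))
    high = frame_features(synth_tone(3000, 1600))
    # A higher-frequency tone crosses zero more often per frame.
    print(low[0], high[0])
```

In a segmentation pipeline, features like these would typically be compared against thresholds or fed to a classifier to refine segment boundaries; the specific decision rules are described in the paper itself, not here.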
Pages: 1318-1331
Page count: 14
Related papers
46 in total
[1]   Expiratory and Inspiratory Cries Detection Using Different Signals' Decomposition Techniques [J].
Abou-Abbas, Lina ;
Tadj, Chakib ;
Gargour, Christian ;
Montazeri, Leila .
JOURNAL OF VOICE, 2017, 31 (02) :259.e13-259.e28
[2]  
Abou-Abbas L, 2015, I CON ADV BIOMED ENG, P262, DOI 10.1109/ICABME.2015.7323302
[3]  
Abou-Abbas L, 2015, CAN CON EL COMP EN, P796, DOI 10.1109/CCECE.2015.7129376
[4]   Automatic detection of the expiratory and inspiratory phases in newborn cry signals [J].
Abou-Abbas, Lina ;
Alaie, Hesam Fersaie ;
Tadj, Chakib .
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2015, 19 :35-43
[5]   Cry-based infant pathology classification using GMMs [J].
Alaie, Hesam Farsaie ;
Abou-Abbas, Lina ;
Tadj, Chakib .
SPEECH COMMUNICATION, 2016, 77 :28-52
[6]  
[Anonymous], 1993, FUNDAMENTALS SPEECH
[7]   Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models [J].
Aucouturier, Jean-Julien ;
Nonaka, Yulri ;
Katahira, Kentaro ;
Okanoya, Kazuo .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2011, 130 (05) :2969-2977
[8]   Voiced/Unvoiced Decision for Speech Signals Based on Zero-Crossing Rate and Energy [J].
Bachu, R. G. ;
Kopparthi, S. ;
Adapa, B. ;
Barkana, B. D. .
ADVANCED TECHNIQUES IN COMPUTING SCIENCES AND SOFTWARE ENGINEERING, 2010, :279-282
[9]  
Barr, Ronald G., 2006, Encyclopedia on Early Childhood Development, P1
[10]   ITU-T recommendation G.729 Annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications [J].
Benyassine, A ;
Shlomot, E ;
Su, HY ;
Massaloux, D ;
Lamblin, C ;
Petit, JP .
IEEE COMMUNICATIONS MAGAZINE, 1997, 35 (09) :64-73