Automatic segmentation of infant cry signals using hidden Markov models

Cited by: 24

Authors:

Naithani, Gaurav ^{[1]}

Kivinummi, Jaana ^{[2]}

Virtanen, Tuomas ^{[1]}

Tammela, Outi ^{[3]}

Peltola, Mikko J. ^{[4]}

Leppanen, Jukka M. ^{[2]}

Affiliations:

[1] Tampere Univ Technol, Dept Signal Proc, Korkeakoulunkatu 10, Tampere, Finland

[2] Univ Tampere, Sch Med, Kalevantie 4, Tampere, Finland

[3] Tampere Univ Hosp, Dept Pediat, Tampere, Finland

[4] Univ Tampere, Sch Social Sci & Humanities, Tampere, Finland

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2018

Funding:

Academy of Finland; National Research Foundation, Singapore;

Keywords:

Infant cry analysis; Acoustic analysis; Audio segmentation; Hidden Markov models; Model adaptation; AUDIO RECORDINGS; NEWBORN; PRETERM; CRIES; TERM; PHONATION; SPEECH;

DOI:

10.1186/s13636-018-0124-x

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Automatic extraction of acoustic regions of interest from recordings captured in realistic clinical environments is a necessary preprocessing step in any cry analysis system. In this study, we propose a hidden Markov model (HMM) based audio segmentation method to identify the relevant acoustic parts of the cry signal (i.e., expiratory and inspiratory phases) from recordings made in natural environments with various interfering acoustic sources. We examine and optimize the performance of the system by using different audio features and HMM topologies. In particular, we propose using fundamental frequency and aperiodicity features. We also propose a method for adapting the segmentation system trained on acoustic material captured in a particular acoustic environment to a different acoustic environment by using feature normalization and semi-supervised learning (SSL). The performance of the system was evaluated by analyzing a total of 3 h and 10 min of audio material from 109 infants, captured in a variety of recording conditions in hospital wards and clinics. The proposed system yields frame-based accuracy up to 89.2%. We conclude that the proposed system offers a solution for automated segmentation of cry signals in cry analysis applications.
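As a rough illustration of the kind of frame-level HMM segmentation and frame-based accuracy the abstract refers to, the sketch below decodes a toy sequence with the Viterbi algorithm. It is not the authors' implementation. The three classes (expiration, inspiration, residual), the transition probabilities, and the hand-written per-frame likelihoods are all assumptions made for the example; a real system would obtain the likelihoods from models trained on acoustic features such as fundamental frequency and aperiodicity.

```python
import numpy as np


def viterbi(log_start, log_trans, log_emit):
    """Return the most likely state sequence for one recording.

    log_start: (K,) log initial state probabilities
    log_trans: (K, K) log transition probabilities (row = from, column = to)
    log_emit:  (T, K) per-frame log-likelihood of each state
    """
    T, K = log_emit.shape
    score = log_start + log_emit[0]
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score of a path that is in state i at t-1 and moves to j
        cand = score[:, None] + log_trans
        backptr[t] = np.argmax(cand, axis=0)
        score = cand[backptr[t], np.arange(K)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path


def frame_accuracy(predicted, reference):
    """Fraction of frames whose predicted label matches the reference label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(reference)))


# Hypothetical labels: 0 = expiration, 1 = inspiration, 2 = residual.
# All numbers below are illustrative, not taken from the paper.
start = np.full(3, 1.0 / 3.0)
trans = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])   # "sticky" states favour long segments
emit = np.array([[0.8, 0.1, 0.1],
                 [0.8, 0.1, 0.1],
                 [0.3, 0.4, 0.3],        # ambiguous frame
                 [0.8, 0.1, 0.1],
                 [0.1, 0.1, 0.8],
                 [0.1, 0.1, 0.8]])
reference = np.array([0, 0, 0, 0, 2, 2])

framewise = np.argmax(emit, axis=1)      # independent per-frame decision
path = viterbi(np.log(start), np.log(trans), np.log(emit))

print("framewise:", framewise.tolist(), frame_accuracy(framewise, reference))
print("viterbi:  ", path.tolist(), frame_accuracy(path, reference))
```

In this toy case the per-frame decision mislabels the ambiguous third frame, while the transition model lets Viterbi decoding keep it inside the surrounding expiration segment.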


Pages: 14

