Feature Set Optimisation for Infant Cry Classification

Cited by: 1
Authors
Vignolo, Leandro D. [1, 2]
Albornoz, Enrique Marcelo [1, 2]
Martinez, Cesar Ernesto [1, 3]
Affiliations
[1] Univ Nacl Litoral CC217, Fac Ingn & Cs Hidr, Res Inst Signals Syst & Computat Intelligence sinc(i), Ciudad Univ, S3000, Paraje El Pozo, Santa Fe, Argentina
[2] Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
[3] Univ Nacl Entre Rios, Fac Ingn, Lab Cibernet, Entre Rios, Argentina
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2018 | 2018 / Vol. 11238
Keywords
Evolutionary algorithms; Features optimization; Crying classification; FEATURE-SELECTION; SPEECH; RECOGNITION; COEFFICIENTS; EXTRACTION;
DOI
10.1007/978-3-030-03928-8_37
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
This work deals with the development of features for the automatic classification of infant cry, considering three categories: neutral, fussing and crying vocalisations. Mel-frequency cepstral coefficients, together with standard functionals obtained from them, have long been the most widely used features for all kinds of speech-related tasks, including infant cry classification. However, recent works have introduced alternative filter banks that lead to performance improvements and increased robustness. In this work, the optimisation of a filter bank is proposed for feature extraction, and two other spectrum-based feature sets are compared. The first set of features is obtained by optimising the filter bank with an evolutionary algorithm, in order to find a speech representation better suited to infant cry classification. In addition, the classification performance of the optimised representation combined with other spectral features, based on the mean log-spectrum and the auditory spectrum, is evaluated. The results show that these feature sets improve performance on the cry classification task.
Pages: 455-466
Page count: 12
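
As a rough illustration of the filter-bank optimisation described in the abstract, the sketch below evolves the centre positions of a set of triangular filters with a simple genetic algorithm. This is not the authors' implementation: the population size, crossover and mutation scheme, and in particular the fitness function (a dummy spectral-coverage objective standing in for cross-validated cry-classification accuracy) are all assumptions made here for illustration only.

    # Minimal sketch (assumptions noted above): evolving the centre bins of a
    # mel-like triangular filter bank with a basic genetic algorithm.
    import numpy as np

    rng = np.random.default_rng(0)

    N_FILTERS = 26       # number of triangular filters (assumed)
    N_FFT_BINS = 257     # spectrum bins for a 512-point FFT (assumed)
    POP_SIZE = 20
    N_GENERATIONS = 30

    def build_filterbank(centres):
        """Build triangular filters from sorted centre-bin positions."""
        centres = np.sort(centres)
        edges = np.concatenate(([0.0], centres, [N_FFT_BINS - 1.0]))
        fb = np.zeros((N_FILTERS, N_FFT_BINS))
        for i in range(N_FILTERS):
            left, centre, right = edges[i], edges[i + 1], edges[i + 2]
            for b in range(N_FFT_BINS):
                if left < b <= centre:
                    fb[i, b] = (b - left) / max(centre - left, 1e-6)
                elif centre < b < right:
                    fb[i, b] = (right - b) / max(right - centre, 1e-6)
        return fb

    def fitness(centres):
        # Placeholder objective: in the actual task this would be the
        # classification accuracy obtained with features extracted
        # through the candidate filter bank.
        fb = build_filterbank(centres)
        return -np.var(fb.sum(axis=0))  # reward even spectral coverage

    # Each individual is a vector of filter centre positions (in FFT bins).
    population = [np.sort(rng.uniform(1, N_FFT_BINS - 2, N_FILTERS))
                  for _ in range(POP_SIZE)]

    for gen in range(N_GENERATIONS):
        scores = np.array([fitness(ind) for ind in population])
        order = np.argsort(scores)[::-1]                 # best individuals first
        parents = [population[i] for i in order[:POP_SIZE // 2]]
        children = []
        while len(children) < POP_SIZE - len(parents):
            a, b = rng.choice(len(parents), 2, replace=False)
            cut = rng.integers(1, N_FILTERS)             # one-point crossover
            child = np.concatenate([parents[a][:cut], parents[b][cut:]])
            child += rng.normal(0, 2.0, N_FILTERS)       # Gaussian mutation
            children.append(np.clip(child, 1, N_FFT_BINS - 2))
        population = parents + children

    best = max(population, key=fitness)
    print("Best filter centres (bins):", np.round(np.sort(best), 1))

In a real setting the dummy objective would be replaced by the accuracy of the cry classifier evaluated on held-out data, which is what makes the evolutionary search expensive but task-specific.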