Deep Learning Assisted Neonatal Cry Classification via Support Vector Machine Models

Cited by: 36
Authors
Ashwini, K. [1]
Vincent, P. M. Durai Raj [1]
Srinivasan, Kathiravan [1]
Chang, Chuan-Yu [2]
Affiliations
[1] Vellore Inst Technol VIT, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[2] Natl Yunlin Univ Sci & Technol, Dept Comp Sci & Informat Engn, Touliu, Yunlin, Taiwan
Keywords
convolutional neural network; infant cry classification; short-time Fourier transform; support vector machine; spectrogram; NEURAL-NETWORKS; SYSTEM
DOI
10.3389/fpubh.2021.670352
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Discipline classification codes
1004; 120402
Abstract
Neonatal infants communicate through cries, and cry signals show distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection need expert attention and considerable effort. Deep learning techniques extract and select the most important features automatically, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into three classes: pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input, and the features obtained from the network are passed to a support vector machine (SVM) classifier, which classifies the cries. The work thus combines the advantages of machine learning and deep learning to obtain good results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Among the SVM kernels compared, namely radial basis function (RBF), linear, and polynomial, the SVM-RBF-based infant cry classification system provides the highest accuracy, 88.89%.
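The pipeline in the abstract (STFT spectrogram, feature extraction, SVM with three kernels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the signals are synthetic noisy tones with made-up class frequencies, the function names are hypothetical, and a simple band-pooling step stands in for the DCNN feature extractor, which the abstract does not specify in enough detail to reproduce.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def stft_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram, shape (frame_len // 2 + 1, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log1p(magnitude)


def pooled_features(spec, n_bands=16):
    """ASSUMPTION: placeholder for the CNN; mean log-energy per frequency band."""
    bands = np.array_split(spec, n_bands, axis=0)
    return np.array([band.mean() for band in bands])


def synthetic_cry(rng, base_freq, sr=8000, duration=1.0):
    """ASSUMPTION: stand-in for a recording; noisy tone plus two harmonics."""
    t = np.arange(int(sr * duration)) / sr
    f0 = base_freq * (1 + 0.05 * rng.standard_normal())
    tone = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3))
    return tone + 0.5 * rng.standard_normal(t.size)


def compare_kernels(seed=0, per_class=40):
    """Train one SVM per kernel on the same features; return test accuracies."""
    rng = np.random.default_rng(seed)
    # Class frequencies are invented for illustration only.
    class_freqs = {"pain": 550.0, "hunger": 450.0, "sleepiness": 350.0}
    X, y = [], []
    for label, freq in class_freqs.items():
        for _ in range(per_class):
            spec = stft_spectrogram(synthetic_cry(rng, freq))
            X.append(pooled_features(spec))
            y.append(label)
    X, y = np.array(X), np.array(y)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    scaler = StandardScaler().fit(X_tr)  # fit scaling on training data only
    scores = {}
    for kernel in ("rbf", "linear", "poly"):
        clf = SVC(kernel=kernel).fit(scaler.transform(X_tr), y_tr)
        scores[kernel] = clf.score(scaler.transform(X_te), y_te)
    return scores
```

Calling `compare_kernels()` returns a dictionary mapping each kernel name to its test accuracy. Accuracies on this synthetic data say nothing about the 88.89% reported in the abstract; the sketch only shows how the three kernels can be compared on identical features.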
Pages: 10