An 'End-to-Evolution' Hybrid Approach for Snore Sound Classification

Cited by: 18

Authors:

Freitag, Michael [1]

Amiriparian, Shahin [1,2]

Cummins, Nicholas [1]

Gerczuk, Maurice [1]

Schuller, Bjoern [1,3]

Affiliations:

[1] Univ Passau, Chair Complex & Intelligent Syst, Passau, Germany

[2] Tech Univ Munich, Machine Intelligence & Signal Proc Grp, Munich, Germany

[3] Imperial Coll London, Machine Learning Grp, London, England

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Funding:

European Union Horizon 2020;

Keywords:

competitive swarm optimisation; evolutionary feature selection; convolutional neural network; snoring; computational paralinguistics; RECOGNITION; DECEPTION; SELECTION;

DOI:

10.21437/Interspeech.2017-173

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Whilst snoring itself is usually not harmful to a person's health, it can be an indication of Obstructive Sleep Apnoea (OSA), a serious sleep-related disorder. As a result, studies into using snoring as an acoustic-based marker of OSA are gaining in popularity. Motivated by this, the INTERSPEECH 2017 ComParE Snoring sub-challenge requires classifying the areas of the upper airways in which different snoring sounds originate. This paper explores a hybrid approach combining evolutionary feature selection based on competitive swarm optimisation and deep convolutional neural networks (CNNs). Feature selection is applied to novel deep spectrum features extracted directly from spectrograms using pre-trained image classification CNNs. Key results presented demonstrate that our hybrid approach can substantially increase the performance of a linear support vector machine on a set of low-level features extracted from the Snoring sub-challenge data. Even without subset selection, the deep spectrum features are sufficient to outperform the challenge baseline, and competitive swarm optimisation further improves system performance. In comparison to the challenge baseline, unweighted average recall is increased from 40.6% to 57.6% on the development partition, and from 58.5% to 66.5% on the test partition, using 2,246 of the 4,096 deep spectrum features.
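To make two of the technical ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It shows (a) unweighted average recall (UAR), computed as the mean of the per-class recalls, and (b) a competitive swarm optimiser applied to feature subset selection, in which particles compete in random pairs and only the loser of each pair is updated, learning from the winner and from the swarm mean. The names `uar`, `to_mask`, `cso_select` and `toy_fitness`, the 0.5 threshold that turns a position into a feature mask, the parameter values, and the toy fitness itself are all assumptions made for illustration. In a real pipeline the fitness would more plausibly be something like 1 - UAR of a classifier trained on the selected features.

```python
import random


def uar(y_true, y_pred):
    """Unweighted average recall: the mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def to_mask(position):
    """Assumed encoding: a feature is selected when its coordinate exceeds 0.5."""
    return [v > 0.5 for v in position]


def cso_select(fitness, n_features, swarm_size=20, iterations=30, phi=0.1, seed=0):
    """Minimise `fitness(mask)` with a competitive swarm optimiser (sketch)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(n_features)] for _ in range(swarm_size)]
    vel = [[0.0] * n_features for _ in range(swarm_size)]
    for _ in range(iterations):
        # Mean position of the swarm at the start of this generation.
        mean = [sum(p[d] for p in pos) / swarm_size for d in range(n_features)]
        order = list(range(swarm_size))
        rng.shuffle(order)
        # Random pairwise competitions; winners pass on unchanged.
        for a, b in zip(order[0::2], order[1::2]):
            fa, fb = fitness(to_mask(pos[a])), fitness(to_mask(pos[b]))
            win, lose = (a, b) if fa <= fb else (b, a)
            for d in range(n_features):
                r1, r2, r3 = rng.random(), rng.random(), rng.random()
                vel[lose][d] = (r1 * vel[lose][d]
                                + r2 * (pos[win][d] - pos[lose][d])
                                + phi * r3 * (mean[d] - pos[lose][d]))
                # Keep positions inside the unit hypercube.
                pos[lose][d] = min(1.0, max(0.0, pos[lose][d] + vel[lose][d]))
    return to_mask(min(pos, key=lambda p: fitness(to_mask(p))))


def toy_fitness(mask):
    """Hypothetical objective: features 0-4 are useful, all others cost 0.1 each."""
    missing = sum(1 for d in range(5) if not mask[d])
    extra = sum(1 for d in range(5, len(mask)) if mask[d])
    return missing + 0.1 * extra


selected = cso_select(toy_fitness, n_features=20)
print(sum(selected), "features selected, toy fitness", toy_fitness(selected))
```

Because the winner of every pair is carried over unchanged, the best fitness in the swarm cannot get worse from one generation to the next when the fitness is deterministic. UAR is used rather than plain accuracy because it weights every class equally: predicting only the majority class on labels `[0, 0, 0, 1]` gives 75% accuracy but a UAR of 0.5.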


Pages: 3507-3511

Number of pages: 5
