A CNN-GRU APPROACH TO CAPTURE TIME-FREQUENCY PATTERN INTERDEPENDENCE FOR SNORE SOUND CLASSIFICATION

Cited by: 0

Authors:

Wang, Jianhong [1]

Stromfelt, Harald [1]

Schuller, Bjoern W. [1, 2]

Affiliations:

[1] Imperial Coll London, Dept Comp, London, England

[2] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany

Source:

2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2018

Keywords:

DualConvGRU Network; Dual Convolutional Layers; Channel Slice Model; EMOTION; RECOGNITION; DECEPTION;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this work, we propose an architecture named the DualConvGRU Network to address the INTERSPEECH 2017 ComParE Snoring sub-challenge. In this network, we devise two new models: the Dual Convolutional Layer, which is applied to a spectrogram to extract features; and the Channel Slice Model, which reprocesses the extracted features. The first amalgamates an ensemble of information collected from two types of convolutional operations, with differing kernel dimensions on the frequency axis and equal dimensions on the time axis. The second learns the dependencies along the channel axis of the convolutional layer by feeding channel slices into a Gated Recurrent Unit (GRU) layer. With this approach, convolutional layers can be connected to sequential models without the use of fully connected layers. Compared with other state-of-the-art methods submitted to the INTERSPEECH 2017 ComParE Snoring sub-challenge, our method ranks 5th in performance on the test data. Moreover, apart from the baseline, we are the only competitor to train a deep learning model solely on the provided training data. The performance of our model substantially exceeds that of the baseline.
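
The data flow described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not give layer sizes, how the two convolution branches are combined, or how a channel slice is presented to the GRU. The sketch assumes that the branches are concatenated along the channel axis, that each channel's feature map is flattened into one GRU input step, and that biases can be omitted. All names (conv2d_same, dual_conv, channel_slice_gru, the kernel sizes 3x3 and 5x3, the hidden size) are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random weights standing in for trained parameters (illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def conv2d_same(spec, kernel):
    """Single-channel 2D cross-correlation with zero padding and ReLU.

    spec is indexed [frequency][time]; kernel is [k_freq][k_time] with odd sizes,
    so the output has the same shape as the input.
    """
    n_f, n_t = len(spec), len(spec[0])
    k_f, k_t = len(kernel), len(kernel[0])
    pad_f, pad_t = k_f // 2, k_t // 2
    out = [[0.0] * n_t for _ in range(n_f)]
    for f in range(n_f):
        for t in range(n_t):
            acc = 0.0
            for i in range(k_f):
                for j in range(k_t):
                    ff, tt = f + i - pad_f, t + j - pad_t
                    if 0 <= ff < n_f and 0 <= tt < n_t:
                        acc += kernel[i][j] * spec[ff][tt]
            out[f][t] = max(acc, 0.0)
    return out


def dual_conv(spec, kernels_a, kernels_b):
    """Two convolution branches whose kernels differ only in frequency extent.

    Assumption: the branches are merged by concatenating their channels.
    """
    return [conv2d_same(spec, k) for k in kernels_a + kernels_b]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def gru_step(x, h, p):
    """One step of a standard GRU cell (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def channel_slice_gru(feature_maps, params, hidden_size):
    """Treat the channel axis as the sequence axis of the GRU.

    Assumption: each channel slice is flattened into one input vector.
    """
    h = [0.0] * hidden_size
    for fmap in feature_maps:
        x = [value for row in fmap for value in row]
        h = gru_step(x, h, params)
    return h


rng = random.Random(0)
N_FREQ, N_TIME, HIDDEN = 6, 5, 4

# Toy spectrogram with non-negative values, indexed [frequency][time].
spec = [[rng.random() for _ in range(N_TIME)] for _ in range(N_FREQ)]

kernels_a = [rand_matrix(3, 3, rng) for _ in range(2)]  # 3 bins on the frequency axis
kernels_b = [rand_matrix(5, 3, rng) for _ in range(2)]  # 5 bins, same time extent

maps = dual_conv(spec, kernels_a, kernels_b)

params = {}
for gate in ("z", "r", "h"):
    params["W" + gate] = rand_matrix(HIDDEN, N_FREQ * N_TIME, rng)
    params["U" + gate] = rand_matrix(HIDDEN, HIDDEN, rng)

state = channel_slice_gru(maps, params, HIDDEN)
print(len(maps), len(state))
```

In this sketch the final GRU state is computed directly from the convolutional feature maps, with no fully connected layer in between, which is the kind of connection the abstract describes. A classifier over the snore classes would take the state as input; that stage is left out here because the abstract does not specify it.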

Citation

Pages: 997-1001

Page count: 5

