Lung sounds classification using convolutional neural networks

Cited by: 122

Authors:

Bardou, Dalal [1]

Zhang, Kun [1]

Ahmad, Sayed Mohammad [2]

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China

[2] Lareb Technol, Delhi, India

Source:

ARTIFICIAL INTELLIGENCE IN MEDICINE | 2018, Vol. 88

Keywords:

Convolutional neural network; Lung sounds classification; Handcrafted features extraction; Deep learning; Models ensembling; Support vector machines; REAL-TIME ANALYSIS; DATA AUGMENTATION; CRACKLE; FREQUENCY; MODEL;

DOI:

10.1016/j.artmed.2018.04.008

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Lung sounds convey relevant information about pulmonary disorders, and physicians traditionally evaluate patients with pulmonary conditions by auscultation. However, this technique has limitations. For example, a physician who is not well trained may reach a wrong diagnosis. Moreover, lung sounds are non-stationary, which complicates their analysis, recognition, and distinction. Automatic recognition systems can help to address these limitations. In this paper, we compare three machine learning approaches for lung sounds classification. The first two approaches are based on extracting a set of handcrafted features that are used to train three different classifiers (support vector machines, k-nearest neighbor, and Gaussian mixture models), while the third approach is based on the design of convolutional neural networks (CNN). In the first approach, we extracted 12 MFCC coefficients from the audio files and then calculated six MFCC statistics. We also experimented with normalization to zero mean and unit variance to enhance accuracy. In the second approach, local binary pattern (LBP) features are extracted from the visual representation of the audio files (spectrograms). The features are normalized using whitening. The dataset used in this work consists of seven classes (normal, coarse crackle, fine crackle, monophonic wheeze, polyphonic wheeze, squawk, and stridor). We also tested dataset augmentation techniques on the spectrograms to enhance the final accuracy of the CNN. The results show that the CNN outperformed the handcrafted-feature-based classifiers. (C) 2018 Elsevier B.V. All rights reserved.
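The first approach in the abstract (per-recording MFCC statistics followed by zero-mean, unit-variance normalization) might be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the six statistics, so the ones used here (mean, standard deviation, minimum, maximum, median, range) are a guess, the function names `summarize_mfcc` and `standardize` are hypothetical, and synthetic arrays stand in for MFCC matrices computed from real audio.

```python
import numpy as np

def summarize_mfcc(mfcc):
    """Collapse a (frames, 12) MFCC matrix into one fixed-length vector.

    ASSUMPTION: the six statistics below are illustrative; the abstract
    does not say which six were used.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    low = mfcc.min(axis=0)
    high = mfcc.max(axis=0)
    stats = [
        mfcc.mean(axis=0),
        mfcc.std(axis=0),
        low,
        high,
        np.median(mfcc, axis=0),
        high - low,  # range
    ]
    # 6 statistics x 12 coefficients = 72 values per recording
    return np.concatenate(stats)

def standardize(train, test):
    """Scale to zero mean and unit variance using training statistics only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    return (train - mean) / std, (test - mean) / std

# Synthetic stand-ins for MFCC matrices of recordings with varying length
rng = np.random.default_rng(0)
recordings = [rng.normal(size=(int(rng.integers(50, 100)), 12)) for _ in range(20)]

features = np.stack([summarize_mfcc(m) for m in recordings])
train_z, test_z = standardize(features[:15], features[15:])
```

Computing the mean and standard deviation on the training split alone, then reusing them for the test split, keeps test data from leaking into the normalization step. The resulting vectors could then be passed to any of the classifiers the abstract lists.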


Pages: 58-69

Page count: 12

