Speech phoneme and spectral smearing based non-invasive COVID-19 detection

Cited by: 2

Authors:

Mishra, Soumya [1]

Dash, Tusar Kanti [1]

Panda, Ganapati [1]

Affiliations:

[1] CV Raman Global Univ, Dept Elect & Commun Engn, Bhubaneswar, India

Source:

FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2023 / Volume 5

Keywords:

COVID-19; detection; machine learning; spectral smearing; phoneme analysis; NEURAL-NETWORKS; CLASSIFICATION; FEATURES; MUSIC; NOISE; TIME;

DOI:

10.3389/frai.2022.1035805

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

COVID-19 is a deadly viral infection that mainly affects the nasopharyngeal and oropharyngeal cavities before it reaches the lungs. Early detection followed by immediate treatment can potentially reduce lung invasion and decrease fatality. Recently, several COVID-19 detection methods based on cough and breath sounds have been proposed. However, very little work has been done on the use of phoneme analysis and smearing of the audio signal in COVID-19 detection. This paper addresses that gap by classifying speech samples into COVID-19-positive and healthy audio samples. Additionally, a grouping of the phonemes based on reference classification accuracies is proposed for effective and faster detection of the disease at a primary stage. The Mel and Gammatone cepstral coefficients and their derivatives are used as the features for five standard machine learning-based classifiers. It is observed that the generalized additive model provides the highest accuracy, 97.22%, for the phoneme grouping "/t//r//n//g//l/." This smearing-based phoneme classification technique could also be used in the future to detect other speech-related diseases.
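The abstract names cepstral coefficients "and their derivatives" as the classifier features but does not say how the derivatives are computed. As a minimal sketch only, the block below uses the common regression-based delta formula, d_t = sum_n n (c_{t+n} - c_{t-n}) / (2 sum_n n^2), on a matrix of per-frame coefficients. The function names `delta` and `stack_features`, the window `width=2`, and the edge-padding choice are assumptions made for illustration, not details taken from the paper, and the coefficient matrix is assumed to come from any Mel or Gammatone cepstral front end.

```python
import numpy as np

def delta(coeffs, width=2):
    """Regression-based delta of a (frames, coefficients) array.

    Hypothetical helper: the paper does not specify its derivative
    computation, so this is the widely used regression formula.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    n_frames = coeffs.shape[0]
    # Repeat edge frames so every frame has `width` neighbours on each side.
    padded = np.pad(coeffs, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(coeffs)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom

def stack_features(coeffs, width=2):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(coeffs, width)
    d2 = delta(d1, width)
    return np.hstack([coeffs, d1, d2])
```

With 13 static coefficients per frame, `stack_features` would return 39 values per frame, which could then be summarized per utterance and passed to a classifier of choice.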

Pages: 12