Mispronunciation Detection Using Deep Convolutional Neural Network Features and Transfer Learning-Based Model for Arabic Phonemes

Cited by: 25

Authors:

Nazir, Faria [1]

Majeed, Muhammad Nadeem [1]

Ghazanfar, Mustansar Ali [1]

Maqsood, Muazzam [2]

Affiliations:

[1] Univ Engn & Technol Taxila, Dept Software Engn, Taxila 47050, Pakistan

[2] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock 43600, Pakistan

Source:

IEEE Access | 2019 / Vol. 7

Keywords:

Mispronunciation detection; deep convolutional neural network; computer-assisted language learning; transfer learning; detection system

DOI:

10.1109/ACCESS.2019.2912648

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Computer-assisted language learning (CALL) systems provide an automated framework to identify mispronunciation and give useful feedback. Traditionally, handcrafted acoustic-phonetic features are used to detect mispronunciation. Following this line of research, this paper investigates the use of a deep convolutional neural network (CNN) for mispronunciation detection of Arabic phonemes. We propose two methods: a convolutional neural network features (CNN_features)-based technique and a transfer learning-based technique. In the first method, we use deep CNN features to detect mispronunciation: we extract features from different layers of the CNN (layer4 to layer7) and use them to train k-nearest neighbor (KNN), support vector machine (SVM), and neural network (NN) classifiers. In the transfer learning-based method, we train the CNN using transfer learning to detect mispronunciation. To evaluate the performance of the system, we compare the results of these methods with a baseline handcrafted features-based method for 28 Arabic phonemes. The baseline method uses the same classifiers (KNN, SVM, and NN) to detect mispronunciation. The experimental results show that the handcrafted_features, CNN_features, and transfer learning-based methods achieve accuracies of 82%, 91.7%, and 92.2%, respectively. The transfer learning-based method thus outperforms both the handcrafted_features and the CNN_features-based methods. It also outperforms state-of-the-art techniques in terms of accuracy.
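The CNN_features idea in the abstract (fixed-length feature vectors fed to a conventional classifier such as KNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors below are synthetic Gaussian clusters standing in for CNN-layer activations, and the helper names (`knn_predict`, `make_synthetic_features`, `accuracy`), the feature dimension, and `k=3` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # Euclidean distance from the query to every training vector.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]


def make_synthetic_features(n_per_class, dim, seed=0):
    """Hypothetical stand-in for CNN-layer activations: two Gaussian clusters."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in (("correct", 0.0), ("mispronounced", 3.0)):
        for _ in range(n_per_class):
            xs.append([rng.gauss(center, 1.0) for _ in range(dim)])
            ys.append(label)
    return xs, ys


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Assumed sizes: 50 training and 20 test vectors per class, 8 dimensions each.
train_x, train_y = make_synthetic_features(50, dim=8, seed=1)
test_x, test_y = make_synthetic_features(20, dim=8, seed=2)
acc = accuracy(train_x, train_y, test_x, test_y, k=3)
print(f"KNN accuracy on synthetic features: {acc:.2f}")
```

In a real pipeline the synthetic vectors would be replaced by activations taken from a chosen CNN layer, and the same feature matrix could be passed to an SVM or a small neural network for comparison, as the abstract describes.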

Citation:

Pages: 52589-52608

Page count: 20
