An Acoustic Feature-Based Deep Learning Model for Automatic Thai Vowel Pronunciation Recognition

Authors
Rukwong, Niyada [1]
Pongpinigpinyo, Sunee [1]
Affiliation
[1] Silpakorn Univ, Fac Sci, Dept Comp, Amphoe Muang 73000, Nakhon Pathom, Thailand
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 13
Keywords
computer-assisted pronunciation training; convolutional neural networks; Thai vowels; speech recognition; mel spectrogram; mel frequency cepstral coefficients; BRITISH ENGLISH; SPEAKING RATE; LEARNERS; PERCEPTION; DIALECT; EXPERIENCE; DURATION; NETWORK; LENGTH;
DOI
10.3390/app12136595
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In Thai, mispronouncing a vowel can completely change the meaning of a word. Effective, standardized practice is therefore essential for pronouncing words as correctly as a native speaker does. Since the COVID-19 pandemic, online learning has become increasingly popular. For example, an online pronunciation application system was introduced that has virtual teachers and an intelligent student-evaluation process similar to standardized training by a teacher in a real classroom. This research presents an online automatic computer-assisted pronunciation training (CAPT) system that uses deep learning to recognize Thai vowels in speech. The automatic CAPT system is developed to address the shortage of specialist instructors and the complexity of teaching vowels. It is a unique system that integrates computational techniques with linguistic theory. The deep learning model that recognizes the pronounced vowels is the most significant part of the automatic CAPT system. The major challenge in Thai vowel recognition is correctly identifying vowels spoken in real-world situations. A convolutional neural network (CNN), a deep learning model, is developed and applied to classify pronounced Thai vowels. A new dataset of Thai vowels was designed, collected, and examined by linguists. The optimal CNN model with Mel spectrogram (MS) features achieves the highest accuracy, 98.61%, compared with 94.44% for the baseline long short-term memory (LSTM) model with Mel frequency cepstral coefficients (MFCC) and 90.00% for the baseline LSTM model with MS.
Pages: 28