INTELLIGENT SPEECH RECOGNITION USING FRACTAL AMENDED GRASSHOPPER OPTIMIZATION ALGORITHM WITH DEEP LEARNING APPROACH

Cited by: 0

Authors:

Al-Anazi, Reema G. [1]

Al-Dobaian, Abdullah Saad [2]

Hassan, Asma Abbas [3]

Almanea, Manar [4]

Alghamdi, Ayman Ahmad [5]

Asklany, Somia A. [6]

Al Sultan, Hanan [7]

Majdoubi, Jihen [8]

Affiliations:

[1] Princess Nourah bint Abdulrahman Univ, Coll Humanities & Social Sci, Dept Arab Language & Literature, POB 84428, Riyadh 11671, Saudi Arabia

[2] King Saud Univ, Coll Language Sci, Dept English Language, POB 145111, Riyadh, Saudi Arabia

[3] King Khalid Univ, Appl Coll Mahayil, Comp Sci Dept, Muhayel Aseer 62529, Saudi Arabia

[4] Imam Mohammad Ibn Saud Islamic Univ, Coll Languages & Translat, Dept English, Riyadh 11432, Saudi Arabia

[5] Umm Al-Qura Univ, Arab Language Inst, Dept Arab Teaching, Mecca, Saudi Arabia

[6] Turaif Northern Border Univ, Fac Sci & Arts, Dept Comp Sci & Informat Technol, Ar Ar 91431, Saudi Arabia

[7] King Faisal Univ, Dept English, Coll Arts, Al Hufuf, Saudi Arabia

[8] Majmaah Univ, Coll Sci & Humanities Alghat, Dept Comp Sci, Al Majmaah 11952, Saudi Arabia

Source:

FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY | 2024 / Vol. 32 / Issue 09N10

Keywords:

Speech Recognition; Fractals Amended Grasshopper Optimization; Speech Signals; Deep Learning; Human-To-Machine Interaction; Brain-Like Computing Applications; NETWORK; MODEL;

DOI:

10.1142/S0218348X25400298

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

Humans prefer to convey information through speech in a shared language. Speech recognition is the ability to recognize the words spoken by a speaker. Recent work shows growing attention among researchers in this field, especially in brain-like computing applications, and emphasizes the real-world usability of speech for speaker recognition across different applications. Automatic speech recognition (ASR) is the process of identifying human speech and converting it into text. This area of study has gained much popularity in recent times and is a crucial area of research for human-to-machine interaction. Early methods relied on manual feature extraction and classical algorithms, including Hidden Markov Models (HMM), Gaussian Mixture Models (GMM), and Dynamic Time Warping (DTW). In recent years, neural networks, namely convolutional neural networks (CNN), recurrent neural networks (RNN), and Transformers, have been applied to ASR and have reached outstanding performance. This study introduces the Intelligent Speech Recognition using the Fractal Amended Grasshopper Optimization Algorithm with Deep Learning (ISR-AGODL) approach. The presented ISR-AGODL technique identifies and recognizes speech signals. In the ISR-AGODL technique, the speech signals are first transformed into spectrograms. Features are then derived using a deep convolutional neural network (DCNN) model. Next, the Fractal AGO technique is used to select the hyperparameters. Finally, the speech signals are recognized using the extreme gradient boosting (XGBoost) model. The simulation outcomes of the ISR-AGODL method are validated using a benchmark dataset. The experimental results of the ISR-AGODL method showed a superior accuracy of 96.34% compared with other models.
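
The first stage of the pipeline described in the abstract, turning a speech signal into a spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length, hop size, Hann window, and the names `hann` and `spectrogram` are all assumptions chosen for illustration, and a naive DFT stands in for the FFT that a practical system would use.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive O(n^2) DFT over the non-negative frequencies.
        bins = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# Toy input (assumed): a pure tone that falls exactly on DFT bin 8,
# since 128 Hz * 64 samples / 1024 Hz = 8.
sample_rate = 1024
tone_hz = 128
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)
# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

In a pipeline like the one the abstract outlines, a 2-D array of this kind (time frames by frequency bins) would be the input to the feature-extraction network; the later stages (DCNN, Fractal AGO tuning, XGBoost) are not sketched here because the record gives no details of their configuration.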

Pages: 12
