A review on speech processing using machine learning paradigm

Cited by: 29
Authors
Bhangale, Kishor Barasu [1]
Mohanaprasad, K. [1]
Affiliations
[1] VIT Univ, Sch Elect Engn SENSE, Chennai 600127, Tamil Nadu, India
Keywords
Speech processing; Speech recognition; Machine learning; Speech feature extraction; Speech classification; Speech emotion recognition; INDEPENDENT COMPONENT ANALYSIS; SUPPORT VECTOR MACHINES; SPEAKER RECOGNITION; CLASSIFICATION; HMM; FEATURES; SHIMMER; MODELS; JITTER; ADAPTATION
DOI
10.1007/s10772-021-09808-0
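Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract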
Speech processing plays a crucial role in many signal processing applications, and the last decade has brought substantial advances driven by the machine learning paradigm. Speech processing is closely related to computational linguistics, human-machine interaction, natural language processing, and psycholinguistics. This review article mainly discusses the feature extraction techniques and machine learning classifiers employed in speech processing and recognition tasks. The performance of several machine learning techniques is evaluated for speech emotion recognition on the Berlin EmoDB database. Further, it outlines the broad application areas of, and the challenges in, machine learning for speech processing.
Pages: 367-388
Page count: 22