Voice recognition based on MFCC, SBC and Spectrograms

Cited by: 4

Authors:

Martinez Mascorro, Guillermo Arturo [1]

Aguilar Torres, Gualberto [2]

Affiliations:

[1] Inst Politecn Nacl, Ciencias Ingn Microelect, Mexico City, DF, Mexico

[2] Inst Politecn Nacl, Secc Estudios Posgrad & Invest, ESIME Culhuacan, Mexico City, DF, Mexico

Source:

INGENIUS-REVISTA DE CIENCIA Y TECNOLOGIA | 2013 / Issue 10

Keywords:

Speech recognition with voice changes; Mel Frequency Cepstral Coefficients; Subband-Based Cepstral Parameters; Spectrogram; Support Vector Machine;

DOI:

10.17163/ings.n10.2013.02

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Discipline classification code:

08 ;

Abstract:

One of the problems of Automatic Speech Recognition systems is changes in the speaker's voice. A person's voice can change voluntarily or involuntarily, and the changes can be natural or artificial; in these cases the system can become confused. This paper proposes a recognition system with parallel identification, using three different algorithms: MFCC, SBC and Spectrogram. Using a Support Vector Machine as the classifier, each algorithm yields a group of persons with the highest likelihood and, after an evaluation, the final result is obtained. The aim of this paper is to take advantage of the three algorithms.
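The abstract does not say how the three candidate groups are evaluated to reach the final result. As a rough illustration only, the sketch below assumes each algorithm's classifier has already produced a score per enrolled speaker, keeps the top-k speakers from each, and combines the shortlists with a Borda-style count. The speaker names, the scores, the value of k, and the fusion rule itself are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def top_candidates(scores, k=2):
    """Return the k speaker IDs with the highest score, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def fuse(candidate_lists):
    """Combine ranked shortlists with a Borda-style count.

    This rule is an assumption for illustration; the paper's own
    evaluation step is not described in the abstract.
    """
    points = Counter()
    for shortlist in candidate_lists:
        k = len(shortlist)
        for rank, speaker in enumerate(shortlist):
            points[speaker] += k - rank  # best rank earns the most points
    # Highest total wins; ties are broken alphabetically for determinism.
    return min(points, key=lambda s: (-points[s], s))


# Hypothetical per-speaker scores, standing in for the outputs of three
# SVM classifiers trained on MFCC, SBC and spectrogram features.
mfcc_scores = {"ana": 0.61, "luis": 0.25, "eva": 0.10, "raul": 0.04}
sbc_scores = {"ana": 0.30, "luis": 0.45, "eva": 0.20, "raul": 0.05}
spec_scores = {"ana": 0.50, "luis": 0.15, "eva": 0.30, "raul": 0.05}

shortlists = [
    top_candidates(s, k=2) for s in (mfcc_scores, sbc_scores, spec_scores)
]
winner = fuse(shortlists)
print(shortlists, winner)
```

Here the SBC branch alone would pick a different speaker than the other two branches, and the combined count still settles on the speaker that the shortlists rank highest overall, which is the kind of benefit the parallel design is aiming for.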


Pages: 12-20

Page count: 9
