Speaker identification based on combination of MFCC and UMRT based features

Cited by: 8

Authors:

Antony, Anett [1]

Gopikakumari, R. [1]

Affiliations:

[1] CUSAT, Sch Engn, Div Elect Engn, Cochin 682022, Kerala, India

Source:

8TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATIONS (ICACC-2018) | 2018 / Vol. 143

Keywords:

Speaker identification; MFCC; UMRT; ANN;

DOI:

10.1016/j.procs.2018.10.393

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Subject classification code:

081202 ;

Abstract:

This paper introduces an isolated-word speaker identification system based on a new feature extractor and an Artificial Neural Network. The system is designed for both text-independent and text-dependent speaker identification of English words. The speech is recorded using an audio wave recorder, and preprocessing is then applied to the speech signals. UMRT is a transform that has been used for image compression. Combinations of MFCC and UMRT features are used as the feature extractor. The features are classified using a multi-layer perceptron trained with the back-propagation algorithm. Accuracy is measured using a confusion matrix. The accuracy achieved is around 97.91% for the speech-dependent system and around 94.44% for the speech-independent system. (C) 2018 The Authors. Published by Elsevier B.V.
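The abstract states that accuracy is taken from a confusion matrix. As a minimal sketch of that calculation (not the authors' code), the Python below builds a confusion matrix from true and predicted speaker labels and reads overall accuracy as the diagonal sum divided by the total count. The function names and the toy labels are hypothetical and are not data from the paper.

```python
def confusion_matrix(true_labels, predicted_labels, num_speakers):
    """Rows index the true speaker, columns the predicted speaker."""
    matrix = [[0] * num_speakers for _ in range(num_speakers)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Overall accuracy: correctly classified samples (diagonal) over all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy labels for three hypothetical speakers (illustrative only, not the paper's data).
true_labels = [0, 0, 1, 1, 2, 2, 2, 1]
predicted_labels = [0, 0, 1, 2, 2, 2, 2, 1]

cm = confusion_matrix(true_labels, predicted_labels, num_speakers=3)
acc = accuracy_from_confusion(cm)
```

Off-diagonal entries show which speakers are mistaken for one another, which a single accuracy figure hides.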

Pages: 250-257

Page count: 8

8 items in total

[1] Speaker identification using multimodal neural networks and wavelet analysis
Almaadeed, Noor
Aggoun, Amar
Amira, Abbes
[J]. IET BIOMETRICS, 2015, 4 (01) : 18 - 28
[2] Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
Cakir, Emre
Parascandolo, Giambattista
Heittola, Toni
Huttunen, Heikki
Virtanen, Tuomas
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1291 - 1303
[3] Speaker Recognition Using Neural Networks and Conventional Classifiers
Farrell, Kevin R.
Mammone, Richard J.
Assaleh, Khaled T.
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (01) : 194 - 205
[4] Gopikakumari R., 1998, THESIS
[5] Text-Independent Speaker Identification Using the Histogram Transform Model
Ma, Zhanyu
Yu, Hong
Tan, Zheng-Hua
Guo, Jun
[J]. IEEE ACCESS, 2016, 4 : 9733 - 9739
[6] Roy R.C., 2009, THESIS
[7] PHONEME RECOGNITION USING TIME-DELAY NEURAL NETWORKS
WAIBEL, A
HANAZAWA, T
HINTON, G
SHIKANO, K
LANG, KJ
[J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1989, 37 (03) : 328 - 339
[8] Wu Zunjing, 2005, Tsinghua Science and Technology, V10, P158, DOI 10.1016/S1007-0214(05)70048-1