A speaker identification system using a model of artificial neural networks for an elevator application

Cited by: 14
Authors
Adami, AG
Barone, DAC
Affiliations
[1] Univ Caxias Do Sul, Dept Informat, BR-95070560 Caxias Do Sul, RS, Brazil
[2] Univ Fed Rio Grande Sul, Inst Informat, BR-91000000 Porto Alegre, RS, Brazil
Keywords
Elevators; Mathematical models; Multilayer neural networks; Pattern recognition systems; Security systems; Speech analysis; Speech coding
DOI
10.1016/S0020-0255(01)00129-3
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
This paper presents a comparison of features for speaker identification applied to a building security system. The features compared are pitch, formant frequencies, linear predictive coding (LPC) coefficients, and cepstral coefficients derived from the LPC coefficients. The comparison is based on a building security system that uses residents' voices to control access to the building. A multilayer perceptron (MLP), a type of artificial neural network, serves as the classifier. The results show that cepstral coefficients are more efficient than LPC coefficients for this security system. (C) 2001 Elsevier Science Inc. All rights reserved.
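The LPC-derived cepstral coefficients mentioned in the abstract can be sketched as follows. This is an illustrative implementation under common textbook conventions (autocorrelation method, Levinson-Durbin recursion, and the standard LPC-to-cepstrum recursion), not code from the paper; the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one speech frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a_1..a_p
    in the model s[n] ~ sum_k a_k * s[n-k], given autocorrelations r."""
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for stage i+1.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a, err = new_a, err * (1.0 - k * k)
    return a

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1 / (1 - sum_k a_k z^-k), via
    c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k}."""
    c = []
    for n in range(1, n_ceps + 1):
        cn = a[n - 1] if n <= len(a) else 0.0
        cn += sum((k / n) * c[k - 1] * a[n - k - 1]
                  for k in range(1, n) if n - k <= len(a))
        c.append(cn)
    return c
```

As a sanity check of the recursion: for a single-pole model with a = [0.5], the cepstrum is c_n = 0.5**n / n, so `lpc_to_cepstrum([0.5], 3)` yields 0.5, 0.125, and 0.5**3/3.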
Pages: 1-5
Page count: 5
References
9 records
[1] Bennani Y. Proc. IEEE Int. Joint Conf. on Neural Networks, 1991.
[2] Doddington G.R. Proceedings of the IEEE, 1985, 73.
[3] Farrell K.R., Mammone R.J., Assaleh K.T. Speaker recognition using neural networks and conventional classifiers. IEEE Transactions on Speech and Audio Processing, 1994, 2(1): 194-205.
[4] Furui S. Cepstral analysis technique for automatic speaker verification. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981, 29(2): 254-272.
[5] Furui S. Electronics and Communications in Japan, 1973, 56: 62.
[6] Morgan D.P. Neural Networks and Speech Processing. 1991.
[7] Oglesby J. Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, 1990: 261.
[8] Rabiner L.R. Digital Processing of Speech Signals. 1978.
[9] Reynolds D.A. Experimental evaluation of features for robust speaker identification. IEEE Transactions on Speech and Audio Processing, 1994, 2(4): 639-643.