Enhancement of Speech Recognitions for Control Automation Using an Intelligent Particle Swarm Optimization

Cited by: 31
Authors
Chan, Kit Yan [1]
Yiu, Cedric K. F. [2]
Dillon, Tharam S. [1]
Nordholm, Sven [1]
Ling, Sai Ho [3]
Affiliations
[1] Curtin Univ Technol, Dept Elect & Comp Engn, Perth, WA 6102, Australia
[2] Hong Kong Polytech Univ, Dept Appl Math, Hong Kong, Hong Kong, Peoples R China
[3] Univ Technol Sydney, Fac Engn & Informat Engn, Sydney, NSW 2007, Australia
Keywords
Beamformer; intelligent fuzzy systems; particle swarm optimization; speech control; speech recognition; SYSTEM; ALGORITHMS;
DOI
10.1109/TII.2012.2187910
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
For over two decades, speech control mechanisms have been widely applied in manufacturing systems such as factory automation, warehouse automation, and industrial robotic control. To implement speech control, a commercial speech recognizer is used as the interface between users and the automation system. However, users' commands are often contaminated by environmental noise, which degrades speech recognition performance and therefore the control of the automation system. This paper presents a multichannel signal enhancement methodology to improve the performance of commercial speech recognizers. The proposed methodology aims to optimize the recognition accuracy of a commercial speech recognizer in a noisy environment by means of a beamformer designed using an intelligent particle swarm optimization algorithm. It overcomes a limitation of existing signal enhancement approaches, which require the parameters inside the commercial speech recognizer to be tuned, something that is impossible in a real-world situation. It also overcomes a limitation of existing optimization algorithms, including gradient descent methods, genetic algorithms, and classical particle swarm optimization, which are unlikely to produce optimal beamformers for maximizing speech recognition accuracy. The performance of the proposed methodology was evaluated by developing beamformers for a commercial speech recognizer deployed in a warehouse automation application. Results indicate a significant improvement in speech recognition accuracy.
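As a rough illustration of the optimization idea mentioned in the abstract, the sketch below runs a classical global-best particle swarm optimization over a small weight vector. It is not the intelligent variant proposed in the paper, whose details are not given in this record. The objective `toy_error`, the vector `TARGET`, and all parameter values are hypothetical stand-ins; in the setting described above, the objective would instead be derived from the recognition accuracy reported by the speech recognizer, treated as a black box.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bound=1.0, seed=0):
    """Classical global-best PSO. Returns (best_position, best_value)."""
    rng = random.Random(seed)
    # Random initial positions inside [-bound, bound]; zero initial velocities.
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the coordinate to the search bounds.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Hypothetical stand-in for "1 - recognition accuracy": squared distance
# between candidate filter weights and an arbitrary target vector.
TARGET = [0.5, -0.25, 0.1, 0.0]


def toy_error(weights):
    return sum((a - b) ** 2 for a, b in zip(weights, TARGET))


best_w, best_err = pso_minimize(toy_error, dim=len(TARGET))
print(best_w, best_err)
```

Because the optimizer only calls `objective` and never needs gradients, the same loop applies when the score comes from a recognizer whose internal parameters cannot be accessed, which is the constraint the abstract emphasizes.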
Pages: 869-879
Number of pages: 11