The advantage of using an HMM-based approach for faxed word recognition

Cited by: 30
Authors
Elms A.J. [1, 2]
Procter S. [1]
Illingworth J. [1]
Affiliations
[1] School of Electronic Engineering, Information Technology and Mathematics, University of Surrey, Guildford
[2] Fujitsu Microelectronics Ltd., Maidenhead, Berkshire
Keywords
Hidden Markov models; OCR; Word recognition
DOI
10.1007/s100320050003
Abstract
A method for word recognition based on the use of hidden Markov models (HMMs) is described. An evaluation of its performance is presented using a test set of real printed documents that have been subjected to severe photocopy and fax transmission distortions. A comparison with a commercial OCR package highlights the inherent advantages of a segmentation-free recognition strategy when the word images are severely distorted, as well as the importance of using contextual knowledge. The HMM method makes only one quarter of the number of word errors made by the commercial package when tested on word images taken from faxed pages. © 1998 Springer-Verlag Berlin Heidelberg.
Pages: 18-36
Number of pages: 18
Related papers
34 in total
[1] Agazzi O.E., Kuo S.-S., Hidden Markov model based optical character recognition in the presence of deterministic transformations, Pattern Recognition, 26, 12, pp. 1813-1826, (1993)
[2] Agazzi O.E., Kuo S.-S., Joint normalisation and recognition of degraded document images using pseudo-2D hidden Markov models, Proc. Int. Conf. Document Analysis and Recognition, pp. 155-158, (1993)
[3] Agazzi O.E., Kuo S.-S., Levin E., Pieraccini R., Connected and degraded text recognition using planar hidden Markov models, Proc. Int. Conf. Acoustics, Speech and Signal Processing, pp. 113-116, (1993)
[4] Bokser M., Omnidocument technologies, Proc. IEEE, 80, 7, pp. 1066-1078, (1992)
[5] Bose C., Kuo S.-S., Connected and degraded text recognition using hidden Markov model, Pattern Recognition, 27, 10, pp. 1345-1363, (1994)
[6] Bose C.B., Kuo S.-S., Connected and degraded text recognition using hidden Markov model, Proc. 11th Int. Conf. Pattern Recognition, pp. 116-119, (1992)
[7] Bunke H., Roth M., Schukat-Talamazzini E., Off-line recognition of cursive script produced by a cooperative writer, Proc. 12th Int. Conf. Pattern Recognition, pp. 383-386, (1994)
[8] Caesar T., Gloger J., Mandler E., Preprocessing and feature extraction for a handwriting recognition system, Proc. Int. Conf. Document Analysis and Recognition, pp. 408-411, (1993)
[9] Chen F.R., Wilcox L.D., Bloomberg D.S., Detecting and locating partially specified keywords in scanned images using hidden Markov models, Proc. Int. Conf. Document Analysis and Recognition, pp. 133-138, (1993)
[10] Chen F.R., Wilcox L.D., Bloomberg D.S., Word spotting in scanned images using hidden Markov models, Proc. Int. Conf. Acoustics, Speech and Signal Processing, pp. 1-4, (1993)