A binary-tree-based OCR technique for machine-printed characters

Cited by: 7
Authors
Gatos, B [1]
Papamarkos, N [1]
Chamzas, C [1]
Affiliation
[1] DEMOCRITUS UNIV THRACE,DEPT ELECT & COMP ENGN,ELECT CIRCUITS ANAL LAB,GR-67100 XANTHI,GREECE
Keywords
optical character recognition; feature extraction; matching; classification; binary features
DOI
10.1016/S0952-1976(97)00013-4
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper describes the structure of an optical character recognition (OCR) system for printed documents. The system is trained for Latin and Greek typewritten text, but it can easily be adapted to any typewritten character set. The proposed method is divided into two main stages. In the first stage, suitable binary features are extracted, most of which are independent of the scaling and rotation of the characters. A binary tree classification technique is then used, and an optimal tree classifier is constructed. In the second stage, the characters at the end-nodes of the binary tree are classified by using a new template-matching technique. By setting a suitable threshold for the matching, a decision can be reached for the greatest part of the characters. For those characters that the binary tree cannot recognize with great confidence, a secondary minimum-distance classifier, trained with the Zernike moments of the characters, is used. Experimental results show that the performance of the proposed OCR system is high, and the recognition rate can exceed 99.5%. (C) 1997 Published by Elsevier Science Ltd. All rights reserved.
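The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `Leaf`, `descend`, `match_score`, `nearest_label`, and `classify` are hypothetical, the pixel-agreement score is only a stand-in for the paper's template-matching technique, the default threshold of 0.9 is arbitrary, and the moment vectors are assumed to be precomputed (for example, Zernike moment magnitudes) rather than derived here.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple, Union

Bitmap = Tuple[Tuple[int, ...], ...]  # rows of 0/1 pixels, all the same size


@dataclass
class Leaf:
    # End-node of the tree: candidate labels and one template bitmap for each.
    templates: Dict[str, Bitmap]


@dataclass
class Node:
    # Internal node: tests one binary feature and branches on its value.
    feature: int
    zero: Union["Node", Leaf]
    one: Union["Node", Leaf]


def descend(tree: Union[Node, Leaf], features: Sequence[int]) -> Leaf:
    """Stage 1: follow the binary features down to an end-node."""
    node = tree
    while isinstance(node, Node):
        node = node.one if features[node.feature] else node.zero
    return node


def match_score(a: Bitmap, b: Bitmap) -> float:
    """Fraction of agreeing pixels (a simple stand-in, not the paper's measure)."""
    total = sum(len(row) for row in a)
    agree = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return agree / total


def nearest_label(vector: Sequence[float],
                  prototypes: Dict[str, Sequence[float]]) -> str:
    """Minimum-distance classifier over precomputed moment vectors."""
    def sqdist(p: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(vector, p))
    return min(prototypes, key=lambda label: sqdist(prototypes[label]))


def classify(tree: Union[Node, Leaf], features: Sequence[int], bitmap: Bitmap,
             moments: Sequence[float],
             prototypes: Dict[str, Sequence[float]],
             threshold: float = 0.9) -> str:
    """Stage 2: template matching at the end-node, with a fallback classifier."""
    leaf = descend(tree, features)
    best_label, best_score = "", -1.0
    for label, template in leaf.templates.items():
        score = match_score(bitmap, template)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label  # confident match at the end-node
    # Low confidence: defer to the secondary minimum-distance classifier.
    return nearest_label(moments, prototypes)
```

In this sketch the threshold trades off the two classifiers: raising it sends more characters to the moment-based fallback, while lowering it accepts more template matches directly.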
Pages: 403-412
Page count: 10