EXTRACTION OF CHARACTERS FROM FORM DOCUMENTS BY FEATURE POINT CLUSTERING

Cited by: 14
Authors
FAN, KC [1]
LU, JM [1]
WANG, LS [1]
LIAO, HY [1]
Affiliation
[1] ACAD SINICA,INST COMP SCI,TAIPEI 115,TAIWAN
Keywords
DOCUMENT ANALYSIS; FEATURE POINT CLUSTERING; MAXIMIN CLUSTERING ALGORITHM
DOI
10.1016/0167-8655(95)00040-N
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Among the various kinds of documents, forms are an important type. The automatic processing of form documents is essential to the advancement of office automation, and the extraction of characters from form documents is a prerequisite for optical character recognition. In this paper, we present a clustering-based technique for extracting characters from form documents, in which the character extraction process is treated as a pattern clustering problem. The feasibility of the novel method is demonstrated through experiments on various kinds of forms.
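The record does not describe the clustering procedure itself, but the keywords name the maximin clustering algorithm. As a rough illustration only, the following is a minimal sketch of the textbook maximin-distance procedure applied to 2-D points. The representation of feature points as (x, y) tuples, the function name `maximin_cluster`, and the threshold parameter `theta` are assumptions made for this example and are not taken from the paper.

```python
import math


def maximin_cluster(points, theta=0.5):
    """Textbook maximin-distance clustering (illustrative sketch).

    points: list of (x, y) tuples (assumed feature-point representation).
    theta:  hypothetical threshold; a new center is created only while the
            farthest point lies more than theta * (mean distance between
            existing centers) away from its nearest center.
    Returns (centers, labels), where labels[i] indexes into centers.
    """
    if not points:
        return [], []

    # Step 1: take the first point as the first cluster center.
    centers = [points[0]]
    # min_dist[i] is the distance from points[i] to its nearest center.
    min_dist = [math.dist(p, centers[0]) for p in points]

    while True:
        # Step 2: find the point whose nearest center is farthest away.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        candidate = min_dist[idx]
        if candidate == 0:
            break  # every point coincides with some center

        # Step 3: with two or more centers, compare against the threshold.
        if len(centers) >= 2:
            pair_dists = [
                math.dist(a, b)
                for i, a in enumerate(centers)
                for b in centers[i + 1:]
            ]
            mean_pair = sum(pair_dists) / len(pair_dists)
            if candidate <= theta * mean_pair:
                break  # no remaining point is far enough to seed a cluster

        centers.append(points[idx])
        min_dist = [
            min(d, math.dist(p, points[idx]))
            for d, p in zip(min_dist, points)
        ]

    # Step 4: assign every point to its nearest center.
    labels = [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]
    return centers, labels


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    centers, labels = maximin_cluster(pts, theta=0.5)
    print(centers)  # [(0, 0), (11, 10)]
    print(labels)   # [0, 0, 0, 1, 1, 1]
```

In this sketch the number of clusters is not fixed in advance: it grows until no point is far enough from the existing centers. That property is what makes a maximin-style procedure a plausible fit when the number of characters on a form is unknown.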
Pages: 963-970
Number of pages: 8