Segmentation of touching characters in printed Devnagari and Bangla scripts using fuzzy, multifactorial analysis

Cited by: 62
Authors
Garain, U [1]
Chaudhuri, BB [1]
Affiliations
[1] Indian Statistical Institute, Computer Vision and Pattern Recognition Unit, Kolkata 700108, West Bengal, India
Source
IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews | 2002, Vol. 32, No. 4
Keywords
fuzzy decision making; Indian script optical character recognition (OCR); multifactorial analysis; touching characters
DOI
10.1109/TSMCC.2002.807272
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One important cause of poor recognition rates in optical character recognition (OCR) systems is error in character segmentation. Touching characters in scanned documents are a major obstacle to designing an effective character segmentation procedure. This paper presents a new technique for identifying and segmenting touching characters. The technique is based on fuzzy multifactorial analysis. A predictive algorithm is developed to select possible cut columns effectively for segmenting the touching characters. The proposed method has been applied to printed documents in Devnagari and Bangla, the two most popular scripts of the Indian subcontinent. Results on a test set of considerable size show that a reasonable improvement in recognition rate can be achieved with a modest increase in computation.
Pages: 449-459
Page count: 11