Text line extraction from handwritten document pages using spiral run length smearing algorithm

Cited by: 0
Authors
Malakar, Samir [1]
Halder, Sougata [1]
Sarkar, Ram [2]
Das, Nibaran [2]
Basu, Subhadip [2]
Nasipuri, Mita [2]
Affiliations
[1] MCKV Inst Engn, Dept Master Comp Applicat, Liluah, Howrah, India
[2] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, West Bengal, India
Source
PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, DEVICES AND INTELLIGENT SYSTEMS (CODIS) | 2012
Keywords
Text line extraction; Handwritten document pages; CMATERdb; Vertical partitioning; SRLSA; OCR; SEGMENTATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting text lines from document images is an important step in an Optical Character Recognition (OCR) system. In handwritten document images, skewed, touching, or overlapping text lines make this step a real challenge for researchers. This work reports a new text line extraction technique based on the Spiral Run Length Smearing Algorithm (SRLSA). First, the digitized document image is partitioned into a number of vertical fragments of equal width. Then all the text line segments present in these fragments are identified by applying SRLSA. Finally, neighboring text line segments are analyzed and, where necessary, merged so that they fall inside the boundary of the text line to which they actually belong. The technique is tested on the CMATERdb1.1.1 and CMATERdb1.2.1 databases, where it successfully extracts 87.09% and 89.35% of the text lines, respectively.
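The first two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the spiral traversal that distinguishes SRLSA, so the sketch below substitutes the classic row-wise run length smearing algorithm (RLSA) as a stand-in. The function names, the `max_gap` parameter, and the convention that 1 marks an ink pixel are all assumptions made for illustration and are not taken from the paper. The final merging stage is omitted.

```python
import numpy as np


def split_vertical(img, n_fragments):
    """Split a binary image (1 = ink) into vertical strips of near-equal width."""
    return np.array_split(img, n_fragments, axis=1)


def smear_rows(strip, max_gap):
    """Classic row-wise RLSA (a stand-in, not the spiral variant).

    Fills every background run of length <= max_gap that lies between
    two ink pixels in the same row.
    """
    out = strip.copy()
    for row in out:  # each row is a view, so writes land in `out`
        ink = np.flatnonzero(row)
        for a, b in zip(ink[:-1], ink[1:]):
            if b - a - 1 <= max_gap:
                row[a:b] = 1
    return out


def line_bands(strip):
    """Return (top, bottom) row ranges, bottom exclusive, of rows holding ink."""
    has_ink = strip.any(axis=1).astype(int)
    edges = np.diff(np.concatenate(([0], has_ink, [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


# Toy page: 6 rows by 8 columns with two short text line segments.
page = np.zeros((6, 8), dtype=np.uint8)
page[1, [0, 3]] = 1  # segment in the left half
page[4, [5, 7]] = 1  # segment in the right half

strips = split_vertical(page, 2)
smeared = [smear_rows(s, max_gap=2) for s in strips]
bands = [line_bands(s) for s in smeared]
```

Each strip yields its own list of row bands. A merging step, as described in the abstract, would then link bands in neighboring strips that belong to the same text line.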
Pages: 616-619
Page count: 4
Related papers
12 in total
  • [1] A new scheme for unconstrained handwritten text-line segmentation
    Alaei, Alireza
    Pal, Umapada
    Nagabhushan, P.
    [J]. PATTERN RECOGNITION, 2011, 44 (04) : 917 - 928
  • [2] Text line extraction from multi-skewed handwritten documents
    Basu, S.
    Chaudhuri, C.
    Kundu, M.
    Nasipuri, M.
    Basu, D. K.
    [J]. PATTERN RECOGNITION, 2007, 40 (06) : 1825 - 1839
  • [3] Du X., 2008, P INT C FRONT HANDWR, P253
  • [4] Khandelwal A, 2009, LECT NOTES COMPUT SC, V5909, P369, DOI 10.1007/978-3-642-11164-8_60
  • [5] Script-independent text line segmentation in freestyle handwritten documents
    Li, Yi
    Zheng, Yefeng
    Doermann, David
    Jaeger, Stefan
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2008, 30 (08) : 1313 - 1329
  • [6] Text line detection in handwritten documents
    Louloudis, G.
    Gatos, B.
    Pratikakis, I.
    Halatsis, C.
    [J]. PATTERN RECOGNITION, 2008, 41 (12) : 3758 - 3772
  • [7] Text line and word segmentation of handwritten documents
    Louloudis, G.
    Gatos, B.
    Pratikakis, I.
    Halatsis, C.
    [J]. PATTERN RECOGNITION, 2009, 42 (12) : 3169 - 3183
  • [8] Roy P.P., 2008, Proceedings of the International Conference in Frontiers in Handwritten Recognition (ICFHR-08), August 19-21, 2008, Canada, P241
  • [9] Sarkar R., 2012, P CD 3 ICCCNT 12 26
  • [10] Sarkar R, 2009, P 4 IND INT C ART IN, P1861