Segmenting Characters from Malayalam Handwritten Documents

Citations: 0
Authors
Hashrin, C. P. [1]
Jossy, Amal [1]
Sudhakaran, K. [1]
Thushara, A. [1]
John, Ansamma [1]
Affiliation
[1] TKM Coll Engn, Dept Comp Sci & Engn, Kollam, Kerala, India
Source
PROCEEDINGS OF 2019 1ST INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION AND COMMUNICATION TECHNOLOGY (ICIICT 2019) | 2019
Keywords
OCR; segmentation; recognition
DOI
10.1109/iciict1.2019.8741416
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Construction of an Optical Character Recognition (OCR) model for handwritten documents poses many challenges, the most prominent being dataset collection, character segmentation, and classification. This paper focuses on segmentation and presents a novel approach to segmenting individual characters from handwritten Malayalam documents. The approach has three stages: morphological operations, contour analysis, and bounding box detection are used to extract individual lines from the document, then words from each line, and then characters from each word. An additional masking step handles bounding boxes that overlap because of skewed lines and diacritics. The segmented characters can be used either to create datasets or as input to OCR models.
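The line, word, and character hierarchy described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: it assumes an already binarized page (1 = ink, 0 = background), uses 8-connected component labeling in place of contour analysis, and the function names and the `line_reach` and `word_reach` dilation sizes are hypothetical values chosen for a toy grid. The masking step for overlapping boxes is not covered.

```python
from collections import deque


def dilate(img, reach_y, reach_x):
    """Binary dilation with a (2*reach_y+1) x (2*reach_x+1) rectangular kernel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - reach_y), min(h, y + reach_y + 1)):
                    for xx in range(max(0, x - reach_x), min(w, x + reach_x + 1)):
                        out[yy][xx] = 1
    return out


def component_boxes(img):
    """Bounding boxes (top, left, bottom, right), inclusive, of 8-connected blobs.

    Stand-in for contour analysis followed by bounding box detection.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(img, box):
    """Cut the inclusive box out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def segment_page(page, line_reach=3, word_reach=1):
    """Return lines -> words -> character boxes, all in page coordinates.

    line_reach and word_reach are assumed horizontal dilation sizes: wide
    enough to merge words into a line, and characters into a word.
    """
    lines = []
    # Stage 1: wide horizontal dilation merges each text line into one blob.
    for lbox in sorted(component_boxes(dilate(page, 0, line_reach))):
        line_img = crop(page, lbox)
        words = []
        # Stage 2: narrower dilation merges the characters of each word.
        word_boxes = component_boxes(dilate(line_img, 0, word_reach))
        for wbox in sorted(word_boxes, key=lambda box: box[1]):
            word_img = crop(line_img, wbox)
            off_y, off_x = lbox[0] + wbox[0], lbox[1] + wbox[1]
            # Stage 3: undilated blobs inside the word are the characters.
            char_boxes = sorted(component_boxes(word_img), key=lambda box: box[1])
            words.append([(t + off_y, l + off_x, b + off_y, r + off_x)
                          for t, l, b, r in char_boxes])
        lines.append(words)
    return lines


def to_grid(rows):
    """Helper for examples: '#' is ink, anything else is background."""
    return [[1 if c == "#" else 0 for c in row] for row in rows]
```

Calling `segment_page(to_grid(rows))` on a list of equal-length strings returns nested lists of character boxes grouped by line and word. On real scans the dilation sizes would need tuning to the handwriting, and a character with detached diacritics would split into several boxes under this sketch, which is the kind of case the paper's masking step is meant to handle.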
Pages: 6