Seam carving, horizontal projection profile and contour tracing for line and word segmentation of language independent handwritten documents

Cited by: 2
Authors
Das, Mamatarani [1]
Panda, Mrutyunjaya [1]
Affiliations
[1] Utkal Univ, Dept Comp Sci & Applict, Bhubaneswar, Odisha, India
Keywords
Line segmentation; Word segmentation; Seam carving; Horizontal projection profile; Handwritten documents; Connected components
DOI
10.1016/j.rineng.2023.101110
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Handwritten documents remain far more challenging for recognition tasks than printed documents. Rather than isolated characters, practical documents use words or character strings as the elementary components for recognition. In any handwriting recognition task, the segmentation of lines and words plays a pivotal role, as the outputs produced at this stage can drastically affect the performance and results of recognition. This paper proposes an approach to line segmentation that combines two distinct techniques, namely the horizontal projection profile and seam carving. The horizontal projection profile first gives a general idea of where the lines are located in the document. Because the horizontal projection profile alone works better for printed documents and is not sufficient for handwritten ones, where the line separation distance varies from writer to writer, seam carving is then applied to segment the lines finely. Dynamic programming is used to create an energy matrix from the input image and determine the minimum-energy paths from left to right. For word segmentation, contour points are traced before applying the seam carving algorithm to find possible paths, and paths that intersect the characters of the text are removed. The standard publicly available IAM English handwritten dataset and the Bangla Writing dataset are used to analyse the text-line and line-word segmentation technique, and the results show promising recognition accuracy.
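The two line-segmentation stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binarized image in which ink pixels are 1, uses the ink mask itself as a stand-in for the energy matrix (the abstract does not say how the energy is computed), and the function names `horizontal_projection_profile`, `rough_line_bands` and `min_energy_horizontal_seam` are hypothetical.

```python
import numpy as np


def horizontal_projection_profile(binary):
    """Count ink pixels (value 1) in each row of a binarized image."""
    return binary.sum(axis=1)


def rough_line_bands(profile, threshold=0):
    """Return (start, end) row ranges where the profile exceeds threshold.

    This gives only a coarse idea of where text lines lie; `end` is exclusive.
    """
    inked = profile > threshold
    bands, start = [], None
    for y, on in enumerate(inked):
        if on and start is None:
            start = y
        elif not on and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(inked)))
    return bands


def min_energy_horizontal_seam(energy):
    """Cheapest left-to-right path through an energy matrix.

    Dynamic programming over columns; the path may move at most one row
    up or down between neighbouring columns. Returns one row index per column.
    """
    h, w = energy.shape
    cost = energy.astype(float).copy()
    back = np.zeros((h, w), dtype=int)  # row offset (-1, 0, +1) to the predecessor
    rows = np.arange(h)
    for x in range(1, w):
        prev = cost[:, x - 1]
        up = np.concatenate(([np.inf], prev[:-1]))   # predecessor in row y - 1
        down = np.concatenate((prev[1:], [np.inf]))  # predecessor in row y + 1
        choices = np.stack([up, prev, down])
        best = choices.argmin(axis=0)
        back[:, x] = best - 1
        cost[:, x] = energy[:, x] + choices[best, rows]
    # Backtrack from the cheapest end point in the last column.
    seam = np.empty(w, dtype=int)
    seam[-1] = int(cost[:, -1].argmin())
    for x in range(w - 1, 0, -1):
        seam[x - 1] = seam[x] + back[seam[x], x]
    return seam
```

In such a sketch, the projection profile would be used to pick a strip of rows between two neighbouring text lines, and the seam would then be computed inside that strip so that it can bend around ascenders and descenders that a straight cut would slice through.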
Pages: 11
Related papers
30 in total
  • [1] A new scheme for unconstrained handwritten text-line segmentation
    Alaei, Alireza
    Pal, Umapada
    Nagabhushan, P.
    [J]. PATTERN RECOGNITION, 2011, 44 (04) : 917 - 928
  • [2] Seam Carving for Text Line Extraction on Color and Grayscale Historical Manuscripts
    Arvanitopoulos, Nikolaos
    Suesstrunk, Sabine
    [J]. 2014 14TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2014 : 726 - 731
  • [3] Avidan S., Seam Carving for Content-Aware Image Resizing
  • [4] Distance transform based text-line extraction from unconstrained handwritten document images
    Bera, Suman Kumar
    Kundu, Soumyadeep
    Kumar, Neeraj
    Sarkar, Ram
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 186
  • [5] Biswas B., 2017, ADV INTELLIGENT SYST, V460, P107, DOI 10.1007/978-981-10-2107-7_10
  • [6] Das Mamatarani, 2021, Innovations in Bio-Inspired Computing and Applications. Proceedings of the 10th International Conference on Innovations in Bio-Inspired Computing and Applications (IBICA 2019). Advances in Intelligent Systems and Computing (AISC 1180), P27, DOI 10.1007/978-3-030-49339-4_4
  • [7] Text line segmentation in handwritten documents using Mumford-Shah model
    Du, Xiaojun
    Pan, Wumo
    Bui, Tien D.
    [J]. PATTERN RECOGNITION, 2009, 42 (12) : 3136 - 3145
  • [8] Garg Naresh Kumar, 2010, Proceedings of the Seventh International Conference on Information Technology: New Generations (ITNG 2010), P392, DOI 10.1109/ITNG.2010.89
  • [9] Content Independent Writer Identification on Bangla Script: A Document Level Approach
    Halder, Chayan
    Obaidullah, Sk. Md.
    Santosh, K. C.
    Roy, Kaushik
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2018, 32 (09)
  • [10] Kar Aradhana, 2021, 2021 19th OITS International Conference on Information Technology (OCIT), P54, DOI 10.1109/OCIT53463.2021.00022