Text line segmentation in Indian ancient handwritten documents using Faster R-CNN

Cited by: 16
Authors
Jindal, Amar [1]
Ghosh, Rajib [1]
Affiliations
[1] Natl Inst Technol Patna, Dept Comp Sci & Engn, Patna 800005, Bihar, India
Keywords
Text line segmentation; Ancient documents; Devanagari script; Faster R-CNN; Word recognition; Devanagari
DOI
10.1007/s11042-022-13709-y
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text line segmentation in ancient handwritten documents is still considered a challenging task in the field of document analysis and recognition, even though various rule-based methods exist. These methods succeed under constraints such as a roughly uniform background, but they perform poorly when inter-line spacing varies or characters overlap. This article proposes, for the first time in the literature, a robust method based on the faster region-based convolutional neural network (Faster R-CNN) to segment text lines in ancient handwritten documents in Devanagari script. A residual network generates the feature matrix, and a region proposal network (RPN) predicts the proposals. A region of interest (RoI) pooling layer then extracts the regions of interest to locate the text lines. The proposed text line segmentation system was evaluated on a self-generated dataset of ancient handwritten documents in Devanagari script and achieved an F-measure of 99.98%. Experimental results demonstrate that the proposed system outperforms existing state-of-the-art text line segmentation methods.
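As an illustration of how an F-measure for detected text lines can be computed, the following is a minimal sketch and not the authors' evaluation protocol. The abstract does not state the matching criterion, so the intersection-over-union (IoU) threshold of 0.5, the greedy one-to-one matching, and the names `iou` and `f_measure` are all assumptions made for this example.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f_measure(pred: List[Box], truth: List[Box], thresh: float = 0.5) -> float:
    """F-measure of predicted text-line boxes against ground truth.

    Assumption: a prediction counts as a true positive when its best
    unmatched ground-truth box has IoU >= thresh (greedy matching).
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for p in pred:
        best_j, best_score = -1, 0.0
        for j, t in enumerate(truth):
            if j in matched:
                continue
            score = iou(p, t)
            if score > best_score:
                best_j, best_score = j, score
        if best_j >= 0 and best_score >= thresh:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical boxes: one line found correctly, one missed, one spurious.
    truth_lines = [(0, 0, 100, 10), (0, 20, 100, 30)]
    pred_lines = [(0, 0, 100, 10), (0, 50, 100, 60)]
    print(f_measure(pred_lines, truth_lines))
```

With these hypothetical boxes, one of the two predictions matches one of the two ground-truth lines, so precision and recall are both 0.5 and so is the F-measure.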
Pages: 10703-10722
Number of pages: 20