Segmentation-free Word Spotting in Historical Bangla Handwritten Binarized Document

Cited by: 0
Authors
Das, Sugata [1]
Mandal, Sekhar [1]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur, India
Source
2017 NINTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION (ICAPR) | 2017
Keywords
segmentation-free word spotting; SIFT keypoint detector; HOG features; Normalized Cross Correlation; Cosine distance; RETRIEVAL;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Content-Based Image Retrieval (CBIR) for historical handwritten documents is challenging because of the large variety of writing styles and the degradation of manuscripts with age. In this paper, we propose a segmentation-free word spotting method for historical handwritten binarized documents. The query word and the document image are converted into gray-scale images using a distance transform followed by Gaussian smoothing. A SIFT detector is used to locate keypoints in both the query word and the document image, and a Histogram of Oriented Gradients (HOG) feature vector is used to describe each keypoint. We use an efficient search technique that calculates the distance between the query word and each word (or part of a word) present in the document image to spot the zones of interest in the document. The proposed method is tested on three historical handwritten Bengali data-sets and one historical handwritten English data-set. Performance is measured using standard evaluation metrics, which show the efficiency of the proposed method.
Pages: 76-81
Page count: 6
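
The keywords list cosine distance and Normalized Cross Correlation, and the abstract describes computing a distance between the query word and candidate regions of the document. The record does not say how the two measures are combined or applied, so the following is only a minimal sketch of the two measures themselves on plain feature vectors (for example, HOG descriptors). The function names and the handling of all-zero or constant vectors are assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means identical direction."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: an all-zero descriptor is treated as having no
        # similarity to anything (distance 1).
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def normalized_cross_correlation(u, v):
    """Return the zero-mean normalized cross correlation, in [-1, 1]."""
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    if den == 0.0:
        # Assumption: a constant vector carries no pattern, so report 0.
        return 0.0
    return num / den
```

A smaller cosine distance, or a larger correlation, indicates a closer match between a query descriptor and a document descriptor; any acceptance threshold would have to be chosen per data-set.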