OCR-independent and Segmentation-free Word-Spotting in Handwritten Arabic Archive Documents

Cited by: 0
Authors
Aouadi, N. [1]
Kacem, A. [1]
Affiliation
[1] LaTICE, Res Lab Technol Informat & Commun & Elect Engn, Tunis, Tunisia
Source
2013 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND SOFTWARE APPLICATIONS (ICEESA) | 2013
Keywords
OCR; Word-spotting; Generalized Hough Transform; Clustering; Handwritten Recognition; Historical document; RETRIEVAL;
DOI
Not available
Chinese Library Classification
TP31 [Computer Software];
Subject classification code
081202; 0835;
Abstract
This paper presents a word-spotting approach to assist in reading handwritten Arabic archive documents. Because of the low quality of these documents, the proposed approach is segmentation-free and independent of OCR, relying on a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT). It detects words, each described by a model, in document images by finding the model's position in the image. With the GHT, the problem of finding the model's position is transformed into the problem of finding the parameters of the transformation that maps the model into the image. Parameters such as the Hough threshold and the distance between voting points are considered in order to improve word localization and recognition. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. Our first experiments correctly spot 94% of words on average.
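The voting idea behind the GHT described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a translation-only transform, binary word images, and an R-table collapsed to a single orientation bin (a full GHT indexes displacements by edge orientation). The function names and the normalized `hough_threshold` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np

def build_r_table(model):
    """Displacements from each foreground pixel of the model to its reference point.

    Simplification (assumption): one orientation bin, reference point at the model centre.
    """
    ys, xs = np.nonzero(model)
    ref = np.array([model.shape[0] // 2, model.shape[1] // 2])
    return ref - np.stack([ys, xs], axis=1)  # shape (n_points, 2)

def ght_vote(image, r_table):
    """Each foreground pixel of the image votes for every candidate reference position."""
    acc = np.zeros(image.shape, dtype=np.int32)
    ys, xs = np.nonzero(image)
    for dy, dx in r_table:
        cy, cx = ys + dy, xs + dx
        inside = (cy >= 0) & (cy < image.shape[0]) & (cx >= 0) & (cx < image.shape[1])
        np.add.at(acc, (cy[inside], cx[inside]), 1)
    return acc

def spot(image, model, hough_threshold=0.8):
    """Return (position, score); position is None when the peak is below the threshold.

    score is the accumulator peak divided by the number of model points (hypothetical
    normalization, not taken from the paper).
    """
    r_table = build_r_table(model)
    acc = ght_vote(image, r_table)
    peak = np.unravel_index(np.argmax(acc), acc.shape)
    score = float(acc[peak]) / len(r_table)
    if score < hough_threshold:
        return None, score
    return tuple(int(v) for v in peak), score

# Toy word model: an L-shaped stroke plus one isolated dot (10 foreground pixels).
model = np.zeros((5, 7), dtype=np.uint8)
model[0:5, 1] = 1
model[4, 1:6] = 1
model[1, 4] = 1

# Toy page with the model pasted so that its top-left corner is at row 10, column 20.
page = np.zeros((40, 60), dtype=np.uint8)
page[10:15, 20:27] = model

position, score = spot(page, model)
print(position, score)  # (12, 23) 1.0
```

The accumulator peak lands on the model's reference point in the page, which is how the search for the model's position becomes a search over transformation parameters (here, only the two translation components).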
Pages: 36 - 41
Page count: 6
Related papers
50 in total
  • [31] Browsing Heterogeneous Document Collections by a Segmentation-free Word Spotting Method
    Rusinol, Marcal
    Aldavert, David
    Toledo, Ricardo
    Llados, Josep
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 63 - 67
  • [32] R-PHOC: Segmentation-Free Word Spotting using CNN
    Ghosh, Suman K.
    Valveny, Ernest
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 801 - 806
  • [33] Segmentation and Word Spotting Methods for Printed and Handwritten Arabic Texts: A Comparative Study
    Kchaou, Mariem Gargouri
    Kanoun, Slim
    Ogier, Jean-Marc
    13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012), 2012, : 274 - 279
  • [34] Word Stretching for Effective Segmentation and Classification of Historical Arabic Handwritten Documents
    Al Aghbari, Zaher
    Brook, Salama
    RCIS 2009: PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON RESEARCH CHALLENGES IN INFORMATION SCIENCE, 2009, : 217 - 224
  • [35] An Application-Independent and Segmentation-Free Approach for Spotting Queries in Document Images
    Chatbri, Houssem
    Kwan, Paul
    Kameyama, Keisuke
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 2891 - 2896
  • [36] Online Handwritten Cursive Word Recognition Using Segmentation-free and Segmentation-based Methods
    Zhu, Bilan
    Shivram, Arti
    Govindaraju, Venu
    Nakagawa, Masaki
    Proceedings 3rd IAPR Asian Conference on Pattern Recognition ACPR 2015, 2015, : 161 - 165
  • [37] A Segmentation-Free Approach to Strokes Extraction from Online Isolated Arabic Handwritten Character
    Nakkach, Houda
    Hichri, Soumaya
    Haboubi, Sofiene
    Amiri, Hamid
    2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2016, : 431 - 436
  • [38] WSNet - Convolutional Neural Network-based Word Spotting for Arabic and English Handwritten Documents
    Mohammed, Hanadi Hassen
    Subramanian, Nandhini
    Al-Maadeed, Somaya
    Bouridane, Ahmed
    TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2022, 11 (01): : 264 - 271
  • [39] Script Independent Word Spotting in Offline Handwritten Documents Based on Hidden Markov Models
    Wshah, Safwan
    Kumar, Gaurav
    Govindaraju, Venu
    13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012), 2012, : 14 - 19
  • [40] An OCR Free Method for Word Spotting in Printed Documents: the Evaluation of Different Feature Sets
    Rios, Israel
    Britto, Alceu de Souza, Jr.
    Koerich, Alessandro Lameiras
    Soares Oliveira, Luis Eduardo
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2011, 17 (01) : 48 - 63