WSNet - Convolutional Neural Network-based Word Spotting for Arabic and English Handwritten Documents

Cited by: 1
Authors
Mohammed, Hanadi Hassen [1]
Subramanian, Nandhini [1]
Al-Maadeed, Somaya [1]
Bouridane, Ahmed [2]
Affiliations
[1] Qatar Univ Al Jamiaa St, Coll Engn, Dept Comp Sci & Engn, Doha, Qatar
[2] Univ Sharjah, Ctr Data Analyt & Cybersecur, Sharjah, U Arab Emirates
Source
TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS | 2022 / Vol. 11 / No. 1
Keywords
Word spotting; Deep learning; Word recognition; Arabic word spotting
DOI
10.18421/TEM111-33
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new convolutional neural network (CNN) architecture to tackle the problem of word spotting in handwritten documents. A deep learning approach using this novel CNN is developed for the recognition of words in historical handwritten documents. It includes a pre-processing step that resizes all the images to a fixed size. These images are then fed to the CNN for training. The proposed network shows promising results for both Arabic and English, and for both modern and historical documents. Four datasets - IFN/ENIT, Visual Media Lab - Historical Documents (VML-HD), George Washington and IAM - have been used for evaluation. It is observed that the mean average precision for the George Washington dataset is 99.6%, outperforming other state-of-the-art methods. Historical documents in Arabic are known for being complex to work with; this model shows good results for the Arabic datasets as well. This indicates that the architecture is also able to generalize well to other languages.
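The abstract reports mean average precision (mAP) but does not spell out how it is computed. The sketch below shows the standard retrieval definition, which is assumed (not confirmed by this record) to be the one used in the paper. The function names `average_precision` and `mean_average_precision` are illustrative, and each query is assumed to come with a full ranking of the candidate word images, so that every relevant item appears somewhere in the list.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    `relevance` holds one flag per retrieved word image, ordered from the
    best match to the worst; True means the image shows the query word.
    Assumption: the ranking covers all relevant items in the dataset.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    # A query with no relevant items contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precisions."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For example, a query whose ranked results are relevant, irrelevant, relevant has precision 1/1 at the first hit and 2/3 at the second, giving an average precision of about 0.83. A reported mAP of 99.6% therefore means that relevant word images are ranked almost always ahead of irrelevant ones.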
Pages: 264-271
Page count: 8