A Review of Deep Learning Techniques in Document Image Word Spotting

Cited by: 0
Authors
Lalita Kumari
Anuj Sharma
Affiliation
[1] Panjab University, Department of Computer Science and Applications
Source
Archives of Computational Methods in Engineering | 2022 / Vol. 29
Keywords
Document image; Word spotting; Segmentation; Documents indexing; Convolutional neural network; Deep learning; Feature representation; Query by string; Query by example
DOI
Not available
Abstract
Since the early days of pattern recognition, word spotting has been an important test bed for studying how well machines can make decisions. In recent years, word spotting has advanced dramatically, with state-of-the-art techniques reaching a high level of performance in real-life applications. The word spotting domain has driven research by providing suitable yet well-defined challenges for pattern recognition and document analysis practitioners. We continue in this direction by covering the extensive literature and new challenges in this domain, with a comparison of previous work. In particular, we cover the role of recent deep learning techniques in word spotting and the future scope of word spotting with deep learning. We believe that a suitable review of word spotting will be crucial not only for understanding the field today, but also for broader collaborative efforts, especially those involving artificial intelligence based tasks. To facilitate future research in word spotting, we discuss word spotting from the perspective of the learning environment, including its framework design, whose components are the query phase, preprocessing stages, segmentation, feature extraction, feature representation, and matching strategies. Further, we discuss how deep learning works and how it is used in word spotting architectures. The study also includes an experimental comparison on benchmark datasets, so that the research community can evaluate algorithmic advances, and outlines future challenges in this field.
Pages: 1085-1106
Number of pages: 21