Connected Component Based Word Spotting on Persian Handwritten image documents

Cited by: 0
Authors
Mobarakeh, M. Iranpour [1]
Yarmohammadi, H. [1]
Affiliations
[1] Payam Noor Univ, Dept Comp Engn & IT, Tehran, Iran
Source
INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS | 2019 / Vol. 10 / No. 02
Keywords
Persian handwritten documents; connected component; attribute-based classification; label embedding; RETRIEVAL; SHAPE;
DOI
10.22075/IJNAA.2019.4125
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Word spotting makes unindexed image documents searchable by locating one or more words in a document image, given a query word. The problem is challenging, mainly because of the large number of word classes with very small inter-class and substantial intra-class distances. This paper presents a segmentation-based word spotting method for multi-writer Persian handwritten documents using attribute-based classification and label embedding. To this end, a hierarchical framework is proposed. First, candidates are selected based on the sequence of connected components (CCs). Next, the query word is segmented into its constituent CCs, and document regions with a similar CC count are selected based on their distance to the CC count of the query word, which yields the candidate regions. In the final phase, the query word is located only within these candidate regions. A well-known Persian handwritten text dataset, FTH, is used as the benchmark for the presented method. The results show that the proposed method outperforms state-of-the-art methods, reaching 81.02 percent for unseen word class retrieval.
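The candidate-selection idea in the abstract (keeping only document regions whose CC count is close to that of the query word) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 8-connectivity choice, the absolute-difference distance, and the `tolerance` parameter are all assumptions made for illustration, and the images are toy binary grids rather than FTH data.

```python
from collections import deque


def count_connected_components(image):
    """Count 8-connected foreground components in a binary grid (1 = ink)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # New component found: flood-fill it with a breadth-first search.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def select_candidates(query, regions, tolerance=0):
    """Return indices of regions whose CC count is within `tolerance`
    of the query's CC count (hypothetical distance: absolute difference)."""
    query_count = count_connected_components(query)
    return [
        i for i, region in enumerate(regions)
        if abs(count_connected_components(region) - query_count) <= tolerance
    ]


# Toy query word image with three separate ink blobs.
query = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
# Toy document regions with 3, 1, and 2 components respectively.
regions = [
    [[1, 0, 1, 0, 1]],
    [[1, 1, 1, 1, 1]],
    [[1, 0, 0, 1, 0], [0, 0, 0, 0, 0]],
]

exact = select_candidates(query, regions)
relaxed = select_candidates(query, regions, tolerance=1)
```

With these toy inputs, only the first region matches the query's CC count exactly, and relaxing the tolerance to 1 also admits the third region. In a real pipeline the final matching stage would then run only on the surviving regions.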
Pages: 11-21
Page count: 11