Keyword spotting in unconstrained handwritten Chinese documents using contextual word model

Cited by: 11
Authors
Huang, Liang [1]
Yin, Fei [2]
Chen, Qing-Hu [1]
Liu, Cheng-Lin [2]
Affiliations
[1] Wuhan Univ, Sch Elect Informat, Wuhan 430079, Hubei, Peoples R China
[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Keyword spotting; Chinese handwritten documents; Word similarity; Contextual word model; RETRIEVAL; SHAPE; SEGMENTATION; RECOGNITION; ONLINE;
DOI
10.1016/j.imavis.2013.10.003
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a method for keyword spotting in off-line Chinese handwritten documents using a contextual word model. The model measures the similarity between the query word and every candidate word in the document by combining a character classifier with geometric and linguistic context. The geometric context model characterizes single-character likeliness and between-character relationships. The linguistic model exploits the dependency of the word on its external adjacent characters. The combining weights are optimized on training documents. Experiments on the large handwriting database CASIA-HWDB demonstrate the effectiveness of the proposed method and justify the benefits of the geometric and linguistic contexts. Compared to transcription-based text search, the proposed method can provide a higher recall rate, and for spotting four-character words it provides both higher precision and a higher recall rate. (C) 2013 Elsevier B.V. All rights reserved.
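The score combination described in the abstract can be illustrated with a minimal sketch. It assumes a simple weighted sum of three log-scores per candidate word; the abstract does not give the paper's actual formula, so the names `CandidateScores`, `word_similarity`, and `spot_keyword`, as well as the weights and threshold, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CandidateScores:
    """Hypothetical per-candidate log-scores (higher means a better match)."""
    classifier: float  # character classifier score for the query characters
    geometric: float   # geometric context: character likeliness and spacing
    linguistic: float  # linguistic context: fit with the adjacent characters


def word_similarity(scores: CandidateScores,
                    weights: Tuple[float, float, float]) -> float:
    """Combine the three scores as a weighted sum (an assumed form)."""
    w_cls, w_geo, w_lin = weights
    return (w_cls * scores.classifier
            + w_geo * scores.geometric
            + w_lin * scores.linguistic)


def spot_keyword(candidates: List[Tuple[str, CandidateScores]],
                 weights: Tuple[float, float, float],
                 threshold: float) -> List[str]:
    """Return the ids of candidates whose similarity reaches the threshold."""
    return [cand_id for cand_id, scores in candidates
            if word_similarity(scores, weights) >= threshold]


# Illustrative values only; in the paper the weights are optimized on
# training documents rather than set by hand.
weights = (1.0, 0.5, 0.5)
candidates = [
    ("line3_pos12", CandidateScores(-1.0, -2.0, -4.0)),
    ("line7_pos02", CandidateScores(-3.0, -2.0, -2.0)),
]
hits = spot_keyword(candidates, weights, threshold=-4.5)
```

Raising the threshold in such a scheme would trade recall for precision, which is the kind of trade-off the abstract reports when comparing against transcription-based search.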
Pages: 958-968
Number of pages: 11