Document analysis systems that improve with use

Cited by: 1
Authors
Nagy, George [1]
Affiliations
[1] Rensselaer Polytech Inst, Troy, NY 12180 USA
Keywords
Interactive document analysis; Adaptive classification; Style-constrained recognition; Camera-based OCR; Memex; Lifetime reader; RECOGNIZE PATTERNS; FONT RECOGNITION; CLASSIFICATION; CONTEXT; EQUALIZATION; MACHINE; MODELS; OCR
DOI
10.1007/s10032-019-00344-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document analysis tasks for which representative labeled training samples are available have been largely solved. The next frontier is coping with hitherto unseen formats, unusual typefaces, idiosyncratic handwriting and imperfect image acquisition. Adaptive and style-constrained classification methods can overcome some expected variability, but human intervention will remain necessary in many tasks. Interactive pattern recognition includes data exploration and active learning as well as access to stored documents. The principle of "green interaction" is to make use of every intervention to reduce the likelihood that the automated system will make the same mistake again and again. Some of these techniques may pop up in forthcoming personal camera-based memex-like applications that will have a far broader range of input documents and scene text than the current, successful but highly specialized, systems for patents, postal addresses, bank checks and books.
Pages: 13-29
Page count: 17