Classification of incunable glyphs and out-of-distribution detection with joint energy-based models

Cited by: 6
Authors
Kordon, Florian [1]
Weichselbaumer, Nikolaus [2]
Herz, Randall [2]
Mossman, Stephen [3]
Potten, Edward [4]
Seuret, Mathias [1]
Mayr, Martin [1]
Christlein, Vincent [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg, Pattern Recognit Lab, Martensstr 3, D-91058 Erlangen, Germany
[2] Johannes Gutenberg Univ Mainz, Gutenberg Inst Weltliteratur & schriftorientierte, Jakob Welder Weg 18, D-55128 Mainz, Germany
[3] Univ Manchester, Sch Arts Languages & Cultures, Oxford Rd, Manchester M13 9PL, England
[4] Univ York, Ctr Medieval Studies, York YO1 7EP, England
Funding
Arts and Humanities Research Council (UK);
Keywords
Letterpress printing; Glyph extraction; Optical character recognition; Joint energy-based models; OOD detection; NETWORKS; PRODUCTS;
DOI
10.1007/s10032-023-00442-x
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Optical character recognition (OCR) has proved a powerful tool for the digital analysis of printed historical documents. However, its ability to localize and identify individual glyphs is challenged by the tremendous variety in historical type design, the physicality of the printing process, and the state of conservation. We propose to mitigate these problems by a downstream fine-tuning step that corrects for pathological and undesirable extraction results. We implement this idea by using a joint energy-based model which classifies individual glyphs and simultaneously prunes potential out-of-distribution (OOD) samples like rubrications, initials, or ligatures. During model training, we introduce specific margins in the energy spectrum that aid this separation and explore the glyph distribution's typical set to stabilize the optimization procedure. We observe strong classification at 0.972 AUPRC across 42 lower- and uppercase glyph types on a challenging digital reproduction of Johannes Balbus' Catholicon, matching the performance of purely discriminative methods. At the same time, we achieve OOD detection rates of 0.989 AUPRC and 0.946 AUPRC for OOD 'clutter' and 'ligatures' which substantially improves upon recently proposed OOD detection techniques. The proposed approach can be easily integrated into the postprocessing phase of current OCR to aid reproduction and shape analysis research.
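The core idea in the abstract can be illustrated with a minimal sketch: in a joint energy-based model, the classifier's logits define both the class posterior and an energy score E(x) = -logsumexp_y f(x)[y], so a single network classifies glyphs and flags out-of-distribution (OOD) crops such as clutter or ligatures. The PyTorch sketch below is an assumption-laden illustration, not the authors' implementation; the network architecture, the margin values m_in/m_out, the OOD threshold, and the loss form are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlyphJEM(nn.Module):
    """Tiny CNN whose logits are reused as a joint energy-based model (illustrative only)."""
    def __init__(self, num_classes: int = 42):  # 42 glyph classes, as in the abstract
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))  # class logits f_theta(x)

    def energy(self, x: torch.Tensor) -> torch.Tensor:
        # E_theta(x) = -logsumexp_y f_theta(x)[y]; low energy suggests an in-distribution glyph
        return -torch.logsumexp(self.forward(x), dim=-1)

model = GlyphJEM()
glyph_crops = torch.rand(8, 1, 32, 32)      # stand-in for extracted glyph images
ood_crops = torch.rand(8, 1, 32, 32)        # stand-in for clutter/ligature crops

probs = F.softmax(model(glyph_crops), dim=-1)   # glyph classification (42 classes)
is_ood = model.energy(glyph_crops) > -1.0       # placeholder threshold for OOD pruning

# Illustrative margin objective: push in-distribution energies below m_in and OOD
# energies above m_out, analogous to the "margins in the energy spectrum" mentioned
# in the abstract (the exact loss and margin values here are assumptions).
m_in, m_out = -5.0, -1.0
e_id, e_ood = model.energy(glyph_crops), model.energy(ood_crops)
margin_loss = (F.relu(e_id - m_in) ** 2).mean() + (F.relu(m_out - e_ood) ** 2).mean()
```

Because the energy score is read off the same logits used for classification, the OOD filter requires no additional network and can run as a postprocessing pass over OCR-extracted crops, in line with the integration claim in the abstract.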
Pages: 223-240
Page count: 18