Information Extraction from Scanned Invoice Documents Using Deep Learning Methods

Cited by: 0

Authors:

Avci, Ufuk Ilke [1]

Goularas, Dionysis [1]

Korkmaz, Emin Erkan [1]

Deveci, Baris [2]

Affiliations:

[1] Yeditepe Univ, Dept Comp Engn, Istanbul, Turkiye

[2] Intecon Informat & Consultancy, Istanbul, Turkiye

Source:

2024 IEEE THIRTEENTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS, IPTA 2024 | 2024

Keywords:

Transformers; Graph Convolutional Networks; Key Information Extraction; LayoutLM

DOI:

10.1109/IPTA62886.2024.10755641

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we explore innovative approaches in the field of information extraction from scanned invoice documents using deep learning methods. Our study makes significant contributions in three key areas. Firstly, we introduce a novel organizational method for labeling invoices, designed to enhance the efficiency and accuracy of data extraction. This lays a foundation for future research in this domain. Secondly, we break new ground by classifying a larger number of classes, 57 in total, far exceeding the 8-10 classes typically addressed in the existing literature. This comprehensive classification enables a more detailed and nuanced understanding of invoice data. Lastly, we present our experimentation with various deep learning architectures, including Graph Convolutional Network (GCN), LayoutLMv1, and LayoutLMv3. Notably, our findings reveal promising, albeit preliminary, results for GCN, an architecture that is not pre-trained, suggesting potential for further exploration and development in this area.


Pages: 6
