Multimodal weighted graph representation for information extraction from visually rich documents

Cited by: 2

Authors:

Gbada, Hamza [1,2]

Kalti, Karim [2,3]

Mahjoub, Mohamed Ali [2]

Affiliations:

[1] Univ Sousse, Higher Inst Informat & Commun Technol, Sousse, Tunisia

[2] Natl Engn Sch Sousse ENISo, Lab Adv Technol & Intelligent Syst LATIS, Sousse, Tunisia

[3] Univ Monastir, Fac Sci Monastir, Monastir, Tunisia

Source:

NEUROCOMPUTING | 2024 / Volume 573

Keywords:

Information extraction; Visually rich documents; Graph convolutional networks;

DOI:

10.1016/j.neucom.2023.127223

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a novel system for information extraction from visually rich documents (VRDs) using a weighted graph representation. The proposed method aims to improve information extraction performance by capturing the relationships between the components of a VRD. Each VRD is modeled as a weighted graph in which nodes encode the visual, textual, and spatial features of text regions and edges represent the relationships between neighboring text regions. Information extraction is then cast as a node classification task: the VRD graphs are fed into a graph convolutional network. The approach is evaluated on diverse documents, including invoices and receipts, and achieves results equal to or better than strong baselines.
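The abstract does not give the feature extractors, the edge-weighting rule, or the network architecture, so the following is only a minimal sketch of the general idea under stated assumptions: text regions are linked to their k nearest neighbours, edge weights are taken as 1 / (1 + distance), node features are random placeholders, and two graph-convolution steps with the common symmetric normalisation produce per-node class scores. All function names, the weighting rule, and the sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def build_weighted_adjacency(centers, k=2):
    """Link each text region to its k nearest neighbours (assumes distinct centers).
    Edge weight = 1 / (1 + Euclidean distance): an illustrative choice only."""
    n = len(centers)
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    adj = np.zeros((n, n))
    for i in range(n):
        # Position 0 of the sorted order is the region itself (distance 0).
        neighbours = np.argsort(dist[i])[1:k + 1]
        adj[i, neighbours] = 1.0 / (1.0 + dist[i, neighbours])
    return np.maximum(adj, adj.T)  # make the graph undirected

def normalize_adjacency(adj):
    """Symmetric normalisation with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, features, w_hidden, w_out):
    """Two graph-convolution steps: ReLU hidden layer, then linear class scores."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w_hidden, 0.0)
    return a_norm @ hidden @ w_out

rng = np.random.default_rng(0)
# Hypothetical normalised (x, y) centers of four text regions on a page.
centers = np.array([[0.1, 0.1], [0.5, 0.1], [0.1, 0.3], [0.5, 0.9]])
features = rng.normal(size=(4, 8))   # placeholder for fused node features
w_hidden = rng.normal(size=(8, 16))  # untrained weights, for shape checking only
w_out = rng.normal(size=(16, 3))     # 3 hypothetical field classes

adj = build_weighted_adjacency(centers)
logits = gcn_forward(adj, features, w_hidden, w_out)
predicted = logits.argmax(axis=1)    # one class index per text region
```

In a real pipeline the weights would be trained with a cross-entropy loss over labelled text regions; here they are random, so the predicted labels carry no meaning beyond showing the data flow.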


Pages: 9

49 items in total

[1] Information Extraction from Visually Rich Documents Using Directed Weighted Graph Neural Network
Gbada, Hamza
Kalti, Karim
Mahjoub, Mohamed Ali
DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT VI, 2024, 14809 : 248 - 263
[2] Information Extraction from Text Intensive and Visually Rich Banking Documents
Oral, Berke
Emekligil, Erdem
Arslan, Secil
Eryigit, Gulsen
INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (06)
[3] Visual Segmentation for Information Extraction from Heterogeneous Visually Rich Documents
Sarkhel, Ritesh
Nandi, Arnab
SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 247 - 262
[4] A Span Extraction Approach for Information Extraction on Visually-Rich Documents
Nguyen, Tuan-Anh D.
Vu, Hieu M.
Nguyen Hong Son
Minh-Tien Nguyen
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT II, 2021, 12917 : 353 - 363
[5] Deep learning approaches for information extraction from visually rich documents: datasets, challenges and methods
Gbada, Hamza
Kalti, Karim
Mahjoub, Mohamed Ali
INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2024, : 121 - 142
[6] VisualWordGrid: Information Extraction from Scanned Documents Using a Multimodal Approach
Kerroumi, Mohamed
Sayem, Othmane
Shabou, Aymen
DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT II, 2021, 12917 : 389 - 402
[7] Fusion of visual representations for multimodal information extraction from unstructured transactional documents
Berke Oral
Gülşen Eryiğit
International Journal on Document Analysis and Recognition (IJDAR), 2022, 25 : 187 - 205
[8] Fusion of visual representations for multimodal information extraction from unstructured transactional documents
Oral, Berke
Eryigit, Gulsen
INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2022, 25 (3) : 187 - 205
[9] Information Extraction and Graph Representation for the Design of Formulated Products
Sunkle, Sagar
Saxena, Krati
Patil, Ashwini
Kulkarni, Vinay
Jain, Deepak
Chacko, Rinu
Rai, Beena
ADVANCED INFORMATION SYSTEMS ENGINEERING, CAISE 2020, 2020, 12127 : 433 - 448
[10] VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction
Thanh-Dat Nguyen
Tung Do-Viet
Hung Nguyen-Duy
Tuan-Hai Luu
Le, Hung
Le, Bach
Thongtanunam, Patanamon
PROCEEDINGS OF THE 33RD ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2024, 2024, : 704 - 716
