Deep-learning and graph-based approach to table structure recognition

Cited by: 13
Authors
Lee, Eunji [1]
Park, Jaewoo [1]
Koo, Hyung Il [2]
Cho, Nam Ik [1, 3]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, INMC, Seoul 08826, South Korea
[2] Ajou Univ, Dept Elect & Comp Engn, Suwon 16499, South Korea
[3] Seoul Natl Univ, Sch Data Sci, Seoul 08826, South Korea
Keywords
Deep learning; Document analysis; Graph-based approach; Table understanding
DOI
10.1007/s11042-021-11819-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Table structure recognition is a key component of document understanding. Many prior methods address this problem in three sequential steps: table detection, table component extraction, and structure analysis based on pairwise relations. However, they are limited in handling complexly structured tables and/or practical scenarios (e.g., scanned documents). In this paper, we propose a novel graph-based framework for table structure recognition. To handle complex tables, we formulate tables as planar graphs whose faces are cell regions. We then compute vertex (junction) confidence maps and line fields with heatmap regression networks that have a small number of parameters (about 1M), and reconstruct tables by solving a constrained optimization problem. We demonstrate the robustness of the proposed system through experiments on the ICDAR 2019 dataset and on challenging table images. Experimental results show that the proposed method outperforms the conventional method across a range of scenarios and generalizes well.
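The planar-graph formulation mentioned in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: it builds the junction graph of a regular table and counts cells as bounded faces using Euler's formula for a connected planar graph. The function names and the grid construction are hypothetical, and the confidence maps, line fields, and constrained optimization described in the abstract are not modelled here.

```python
# Illustrative sketch only: a table modelled as a planar graph whose
# bounded faces are the cells. All names here are hypothetical.

def grid_table_graph(rows, cols):
    """Return (vertices, edges) for the ruling lines of a rows x cols table."""
    vertices = {(r, c) for r in range(rows + 1) for c in range(cols + 1)}
    edges = set()
    for r in range(rows + 1):
        for c in range(cols + 1):
            if c < cols:
                edges.add(((r, c), (r, c + 1)))  # horizontal segment
            if r < rows:
                edges.add(((r, c), (r + 1, c)))  # vertical segment
    return vertices, edges


def count_cells(vertices, edges):
    """Bounded faces of a connected planar graph: F = E - V + 1 (Euler)."""
    return len(edges) - len(vertices) + 1


vertices, edges = grid_table_graph(2, 3)
n_regular = count_cells(vertices, edges)

# Merge the two leftmost cells of the top row by removing the segment
# that separates them; the graph stays connected, so the formula holds.
merged_edges = edges - {((0, 1), (1, 1))}
n_merged = count_cells(vertices, merged_edges)

print(n_regular, n_merged)
```

In this toy setting a merged (spanning) cell is simply a face bounded by more segments than usual, which is one way to see why a face-based view suits tables with irregular structure.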
Pages: 5827-5848
Number of pages: 22