Table Detection Method Based on Faster-RCNN and Window Attention

Cited by: 0

Authors:

Chen, Han [1]

Song, Shengli [1]

Su, Rijian [2]

Affiliations:

[1] Zhengzhou Univ Light Ind, Sch Software, Zhengzhou, Henan, Peoples R China

[2] Zhengzhou Univ Light Ind, Sch Comp Sci & Technol, Zhengzhou, Henan, Peoples R China

Source:

PROCEEDINGS OF 2023 THE 12TH INTERNATIONAL CONFERENCE ON NETWORKS, COMMUNICATION AND COMPUTING, ICNCC 2023 | 2023

Keywords:

Self-attention; Table detection; Inverted residual feed-forward network;

DOI:

10.1145/3638837.3638879

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As an important carrier of information, tables offer high data storage density, conciseness, and intuitiveness, and are widely used in office work and daily life. Because table structures are complex and presentation formats are diverse, the automated processing of large numbers of image-based tables has long been a challenge in document recognition. This paper addresses the table detection task in table processing and proposes a detection algorithm that uses an improved window self-attention network to extract features from image-based tables. The method builds on a two-stage object detection algorithm, introduces local feature extraction blocks and inverted residual feed-forward network blocks, and designs a feature pyramid network within the backbone, improving detection performance by strengthening the model's ability to learn the spatial layout features of documents. Experimental comparisons on publicly available datasets verify the effectiveness of the proposed method.
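The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind window self-attention, not the authors' code. It assumes a feature map whose height and width are divisible by the window size. The helper names `window_partition` and `attention_pairs` are hypothetical and chosen for illustration. The sketch splits a grid of tokens into non-overlapping windows and compares how many query-key pairs global attention and windowed attention would have to score.

```python
def window_partition(grid, window):
    """Split an H x W grid (a list of rows) into non-overlapping window x window tiles."""
    height, width = len(grid), len(grid[0])
    if height % window or width % window:
        # Assumption of this sketch: no padding, sizes must divide evenly.
        raise ValueError("grid size must be divisible by the window size")
    tiles = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            tile = [grid[r][left:left + window] for r in range(top, top + window)]
            tiles.append(tile)
    return tiles


def attention_pairs(height, width, window=None):
    """Count query-key pairs: global attention if window is None, else per window."""
    tokens = height * width
    if window is None:
        return tokens * tokens
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window ** 2


# A 4 x 4 grid of token indices, partitioned into four 2 x 2 windows.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = window_partition(grid, 2)
```

Restricting attention to each window makes the number of scored pairs grow linearly with the number of windows rather than quadratically with the total number of tokens, which is the usual motivation for window-based backbones on large document images.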

Citation

Pages: 267-273

Page count: 7

References (14 in total)

[1] Fernandes J, 2022, Neurocomputing, P468

[2] Guo, Jianyuan; Han, Kai; Wu, Han; Tang, Yehui; Chen, Xinghao; Wang, Yunhe; Xu, Chang. CMT: Convolutional Neural Networks Meet Vision Transformers [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 12165-12175

[3] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[4] Kavasidis, I.; Pino, C.; Palazzo, S.; Rundo, F.; Giordano, D.; Messina, P.; Spampinato, C. A Saliency-Based Convolutional Neural Network for Table and Chart Detection in Digitized Documents [J]. IMAGE ANALYSIS AND PROCESSING - ICIAP 2019, PT II, 2019, 11752: 292-302

[5] Li MH, 2020, Arxiv, DOI arXiv:1903.01949

[6] Liu, Ze; Lin, Yutong; Cao, Yue; Hu, Han; Wei, Yixuan; Zhang, Zheng; Lin, Stephen; Guo, Baining. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 9992-10002

[7] Pan, Xuran; Ge, Chunjiang; Lu, Rui; Song, Shiji; Chen, Guanfu; Huang, Zeyi; Huang, Gao. On the Integration of Self-Attention and Convolution [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 805-815

[8] Rao, Yunbo; Cheng, Yiming; Xue, Junmin; Pu, Jiansu; Wang, Qiujie; Jin, Rize; Wang, Qifei. FPSiamRPN: Feature Pyramid Siamese Network With Region Proposal Network for Target Tracking [J]. IEEE ACCESS, 2020, 8: 176158-176169

[9] Sandler, Mark; Howard, Andrew; Zhu, Menglong; Zhmoginov, Andrey; Chen, Liang-Chieh. MobileNetV2: Inverted Residuals and Linear Bottlenecks [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 4510-4520

[10] Schreiber, Sebastian; Agne, Stefan; Wolf, Ivo; Dengel, Andreas; Ahmed, Sheraz. DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017: 1162-1167
