A large-scale dataset for end-to-end table recognition in the wild

Citations: 7
Authors
Yang, Fan [1]
Hu, Lei [1]
Liu, Xinwu [2]
Huang, Shuangping [1, 3]
Gu, Zhenghui [4]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510641, Peoples R China
[2] Zhuzhou CRRC Times Elect Co Ltd, Zhuzhou 412001, Peoples R China
[3] Pazhou Lab, Guangzhou 510335, Peoples R China
[4] South China Univ Technol, Coll Automat Sci & Engn, Guangzhou 510641, Peoples R China
DOI
10.1038/s41597-023-01985-8
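Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification codes
07; 0710; 09
Abstract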
Table recognition (TR), which aims to extract information from tables in an image, is an active research topic in pattern recognition. Common table recognition tasks include table detection (TD), table structure recognition (TSR) and table content recognition (TCR). TD locates tables in the image, TCR recognizes their text content, and TSR recognizes their spatial and ontology (logical) structure. Currently, end-to-end TR in real scenarios, which accomplishes the three sub-tasks simultaneously, remains an unexplored research area. One major factor that inhibits researchers is the lack of a benchmark dataset. To this end, we propose a new large-scale dataset named Table Recognition Set (TabRecSet) with diverse table forms sourced from multiple scenarios in the wild, providing complete annotation dedicated to end-to-end TR research. It is the largest and first bi-lingual dataset for end-to-end TR, with 38.1 K tables, of which 20.4 K are in English and 17.7 K are in Chinese. The samples have diverse forms, such as border-complete and border-incomplete tables, and regular and irregular (rotated, distorted, etc.) tables. The scenarios are varied, ranging from scanned to camera-taken images, documents to Excel tables, and educational test papers to financial invoices. The annotations are complete, consisting of the table body spatial annotation, cell spatial and logical annotation, and text content for TD, TSR and TCR, respectively. The spatial annotation uses polygons instead of the bounding boxes or quadrilaterals adopted by most datasets; polygons are more suitable for the irregular tables that are common in wild scenarios. Additionally, we propose a visualized and interactive annotation tool named TableMe to improve the efficiency and quality of table annotation.
Pages: 14
Related papers
50 in total
  • [11] Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
    Xue, Jian
    Wang, Peidong
    Li, Jinyu
    Post, Matt
    Gaur, Yashesh
    INTERSPEECH 2022, 2022, : 3263 - 3267
  • [12] End-to-End Feasible Optimization Proxies for Large-Scale Economic Dispatch
    Chen, Wenbo
    Tanneau, Mathieu
    Van Hentenryck, Pascal
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (02) : 4723 - 4734
  • [13] End-to-End Lip-Reading Without Large-Scale Data
    Fernandez-Lopez, Adriana
    Sukno, Federico M.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2076 - 2090
  • [14] Eigen: End-to-end Resource Optimization for Large-Scale Databases on the Cloud
    Li, Ji You
    Zhang, Jiachi
    Zhou, Wenchao
    Liu, Yuhang
    Zhang, Shuai
    Xue, Zhuoming
    Xu, Ding
    Fan, Hua
    Zhou, Fangyuan
    Li, Feifei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (12) : 3795 - 3807
  • [15] End-to-End Support for Joins in Large-Scale Publish/Subscribe Systems
    Chandramouli, Badrish
    Yang, Jun
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (01) : 434 - 450
  • [16] END-TO-END TRAINING OF A LARGE VOCABULARY END-TO-END SPEECH RECOGNITION SYSTEM
    Kim, Chanwoo
    Kim, Sungsoo
    Kim, Kwangyoun
    Kumar, Mehul
    Kim, Jiyeon
    Lee, Kyungmin
    Han, Changwoo
    Garg, Abhinav
    Kim, Eunhyang
    Shin, Minkyoo
    Singh, Shatrughan
    Heck, Larry
    Gowda, Dhananjaya
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 562 - 569
  • [17] A large-scale, passive analysis of end-to-end TCP performance over GPRS
    Benko, P
    Malicsko, G
    Veres, A
    IEEE INFOCOM 2004: THE CONFERENCE ON COMPUTER COMMUNICATIONS, VOLS 1-4, PROCEEDINGS, 2004, : 1882 - 1892
  • [18] End-to-end Learning of Driving Models from Large-scale Video Datasets
    Xu, Huazhe
    Gao, Yang
    Yu, Fisher
    Darrell, Trevor
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3530 - 3538
  • [19] Vigil: Effective End-to-end Monitoring for Large-scale Recommender Systems at Glance
    Saxena, Priyansh
    Manisha, R.
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 5249 - 5250
  • [20] End-to-end table structure recognition and extraction in heterogeneous documents
    Kashinath, Tejas
    Jain, Twisha
    Agrawal, Yash
    Anand, Tanvi
    Singh, Sanjay
    APPLIED SOFT COMPUTING, 2022, 123