Automatic extraction of table metadata from digital documents

Citations: 0
Authors
Liu, Ying [1]
Mitra, Prasenjit [1]
Giles, C. Lee [1]
Bai, Kun [1]
Affiliation
[1] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
Source
OPENING INFORMATION HORIZONS | 2006
Keywords
metadata extraction; table detection; table structure recognition; searching; exchanging;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and highlight a collection of results obtained from experiments and scientific analysis. In digital libraries, extracting this data automatically and understanding the structure and content of tables are very important to many applications. Automatic identification, extraction, and search of table contents can be made more precise with the help of metadata. In this paper, we propose a set of medium-independent table metadata to facilitate table indexing, searching, and exchange. To extract the contents of tables and their metadata, an automatic table metadata extraction algorithm is designed and tested on PDF documents.
Pages: 339 / +
Page count: 2
Related papers
50 in total
  • [21] An Assistant to Populate Repositories: Gathering Educational Digital Objects and Metadata Extraction
    Casali, Ana
    Deco, Claudia
    Beltramone, Santiago
    IEEE REVISTA IBEROAMERICANA DE TECNOLOGIAS DEL APRENDIZAJE-IEEE RITA, 2016, 11 (02): : 87 - 94
  • [22] A Hybrid Case-based and Rule-based for Metadata Extraction on Heterogeneous Thai Documents
    Khankasikam, Krisda
    2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 1, 2010, : 312 - 317
  • [23] An automatic selective PDF table-extraction method for collecting materials data from literature
    Deng, Jianxin
    Liu, Gang
    Tang, Rui
    Wu, Xiusong
    Yin, Zheng
    ADVANCES IN ENGINEERING SOFTWARE, 2025, 204
  • [24] An Efficient Framework for Algorithmic Metadata Extraction over Scholarly Documents Using Deep Neural Networks
    Raghavendra Nayaka P.
    Ranjan R.
    SN Computer Science, 4 (4)
  • [25] Automatic General Metadata Extraction and Mapping in an HDF5 Use-case
    Heinrichs, Benedikt
    Preuss, Nils
    Politze, Marius
    Mueller, Matthias S.
    Pelz, Peter F.
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KDIR), VOL 1:, 2021, : 172 - 179
  • [26] Annotated Open Corpus Construction and BERT-Based Approach for Automatic Metadata Extraction From Korean Academic Papers
    Kong, Hyesoo
    Yoon, Hwamook
    Seol, Jaewook
    Hyun, Mihwan
    Lee, Hyejin
    Kim, Soonyoung
    Choi, Wonjun
    IEEE ACCESS, 2023, 11 : 825 - 838
  • [27] Reference Metadata Extraction from Korean Research Papers
    Seol, Jae-Wook
    Choi, Won-Jun
    Jeong, Hee-Seok
    Hwang, Hye-Kyong
    Yoon, Hwa-Mook
    MINING INTELLIGENCE AND KNOWLEDGE EXPLORATION, MIKE 2018, 2018, 11308 : 42 - 52
  • [28] Table understanding in structured documents
    Holecek, Martin
    Hoskovec, Antonin
    Baudis, Petr
    Klinger, Pavel
    2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION WORKSHOPS (ICDARW), VOL 5, 2019, : 158 - 164
  • [29] Metadata Extraction with Cue Model
    Isa, Wan Malini Wan
    Hamid, Jamaliah Abdul
    Ibrahim, Hamidah
    Abdullah, Rusli
    Selamat, Mohd. Hasan
    Abdullah, Muhamad Taufik
    Nasharuddin, Nurul Amelina
    KMICE 2008 - KNOWLEDGE MANAGEMENT INTERNATIONAL CONFERENCE, 2008 - TRANSFERRING, MANAGING AND MAINTAINING KNOWLEDGE FOR NATION CAPACITY DEVELOPMENT, 2008, : 107 - 111
  • [30] Empirical Analysis of Semantic Metadata Extraction from Video Lecture Subtitles
    dos Reis, Julio Cesar
    Borges, Marcos Vinicius Macedo
    Gribeler, Guilherme Pereira
    2019 IEEE 28TH INTERNATIONAL CONFERENCE ON ENABLING TECHNOLOGIES: INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES (WETICE), 2019, : 301 - 306