Automatic extraction of table metadata from digital documents

Cited by: 0
Authors
Liu, Ying [1]
Mitra, Prasenjit [1]
Giles, C. Lee [1]
Bai, Kun [1]
Affiliations
[1] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
Source
OPENING INFORMATION HORIZONS | 2006
Keywords
metadata extraction; table detection; table structure recognition; searching; exchanging;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and to highlight a collection of results obtained from experiments and scientific analysis. In digital libraries, extracting this data automatically and understanding the structure and content of tables are very important to many applications. Automatic identification, extraction, and search of table contents can be made more precise with the help of metadata. In this paper, we propose a set of medium-independent table metadata to facilitate table indexing, searching, and exchanging. To extract the contents of tables and their metadata, an automatic table metadata extraction algorithm is designed and tested on PDF documents.
Pages: 339 / +
Page count: 2
Related papers
50 items in total
  • [31] MexPub: Deep Transfer Learning for Metadata Extraction from German Publications
    Boukhers, Zeyd
    Beili, Nada
    Hartmann, Timo
    Goswami, Prantik
    Zafar, Muhammad Arslan
    2021 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2021), 2021, : 250 - 253
  • [32] A trigram hidden Markov model for metadata extraction from heterogeneous references
    Ojokoh, Bolanle
    Zhang, Ming
    Tang, Jian
    INFORMATION SCIENCES, 2011, 181 (09) : 1538 - 1551
  • [33] A metadata extraction approach from papers based on meta-learning
    Zhang, F. (xjzfz@ysu.edu.cn), 1600, Binary Information Press, Flat F 8th Floor, Block 3, Tanner Garden, 18 Tanner Road, Hong Kong (10): : 1121 - 1129
  • [34] Metadata Extraction System for Chinese Books
    Gao, Liangcai
    Zhong, Yuan
    Tang, Yingmin
    Tang, Zhi
    Lin, Xiaofan
    Hu, Xuan
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 749 - 753
  • [35] Automated metadata extraction: challenges and opportunities
    Skluzacek, Tyler J.
    Chard, Kyle
    Foster, Ian
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON E-SCIENCE (ESCIENCE 2022), 2022, : 495 - 500
  • [36] A Metadata Extractor for Books in a Digital Library
    Akhtar, Sk Simran
    Sanyal, Debarshi Kumar
    Chattopadhyay, Samiran
    Bhowmick, Plaban Kumar
    Das, Partha Pratim
    MATURITY AND INNOVATION IN DIGITAL LIBRARIES, ICADL 2018, 2018, 11279 : 323 - 327
  • [37] Web metadata extraction and semantic indexing for learning objects extraction
    Atkinson, John
    Gonzalez, Andrea
    Munoz, Mauricio
    Astudillo, Hernan
    APPLIED INTELLIGENCE, 2014, 41 (02) : 649 - 664
  • [38] Web metadata extraction and semantic indexing for learning objects extraction
    John Atkinson
    Andrea Gonzalez
    Mauricio Munoz
    Hernan Astudillo
    Applied Intelligence, 2014, 41 : 649 - 664
  • [39] Metadata Extraction from Conference Proceedings Using Template-Based Approach
    Kovriguina, Liubov
    Shipilo, Alexander
    Kozlov, Fedor
    Kolchin, Maxim
    Cherny, Eugene
    SEMANTIC WEB EVALUATION CHALLENGES, 2015, 548 : 153 - 164
  • [40] Ensemble approach for metadata extraction in Persian theses
    Beydaghi, Elham
    Rahnama, Mohadese
    Nasiri, Jalal A.
    2020 6TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2020,