Table understanding approaches for extracting knowledge from heterogeneous tables

Cited by: 15
Authors
Bonfitto, Sara [1]
Casiraghi, Elena [1]
Mesiti, Marco [1]
Affiliations
[1] Univ Milan, Dept Comp Sci, Via Celoria 18, I-20133 Milan, Italy
Keywords
generic tables; knowledge bases; machine learning; table understanding problem; TABULAR DATA; SPREADSHEETS; ARBITRARY; WEB;
DOI
10.1002/widm.1407
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Table understanding methods extract, transform, and interpret the information contained in tabular data embedded in documents and files of different formats. Such automatic understanding would make it possible to exploit tabular information to answer queries accurately, to integrate heterogeneous repositories of information into a common knowledge base, or to exchange information among different sources. The purpose of this survey is to provide a comprehensive analysis of the research efforts devoted so far to the problem of table understanding and to describe systems that support the transformation of heterogeneous tables into meaningful information. This article is categorized under: Application Areas > Data Mining Software Tools; Technologies > Data Preprocessing; Technologies > Structure Discovery and Clustering
Pages: 26