Incremental Knowledge Base Construction Using DeepDive

Cited by: 121
Authors
Shin, Jaeho [1]
Wu, Sen [1]
Wang, Feiran [1]
De Sa, Christopher [1]
Zhang, Ce [1, 2]
Re, Christopher [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Univ Wisconsin Madison, Madison, WI 53706 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2015 / Vol. 8 / No. 11
Funding
U.S. National Science Foundation
DOI
10.14778/2809974.2809991
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Populating a database with unstructured information is a long-standing problem in industry and research that encompasses problems of extraction, cleaning, and integration. Recent names used for this problem include dealing with dark data and knowledge base construction (KBC). In this work, we describe DeepDive, a system that combines database and machine learning ideas to help develop KBC systems, and we present techniques to make the KBC process more efficient. We observe that the KBC process is iterative, and we develop techniques to incrementally produce inference results for KBC systems. We propose two methods for incremental inference, based respectively on sampling and variational techniques. We also study the tradeoff space of these methods and develop a simple rule-based optimizer. DeepDive includes all of these contributions, and we evaluate DeepDive on five KBC systems, showing that it can speed up KBC inference tasks by up to two orders of magnitude with negligible impact on quality.
Pages: 1310-1321
Page count: 12