Tabular: A Schema-Driven Probabilistic Programming Language

Cited by: 22
Authors
Gordon, Andrew D. [1]
Graepel, Thore
Rolland, Nicolas
Russo, Claudio
Borgstroem, Johannes
Guiver, John
Affiliations
[1] Univ Edinburgh, Edinburgh EH8 9YL, Midlothian, Scotland
Keywords
Bayesian reasoning; machine learning; model-learner pattern; probabilistic programming; relational data
DOI
10.1145/2535838.2535850
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We propose a new kind of probabilistic programming language for machine learning. We write programs simply by annotating existing relational schemas with probabilistic model expressions. We describe a detailed design of our language, Tabular, complete with formal semantics and type system. A rich series of examples illustrates the expressiveness of Tabular. We report an implementation, and show evidence of the succinctness of our notation relative to current best practice. Finally, we describe and verify a transformation of Tabular schemas so as to predict missing values in a concrete database. The ability to query for missing values provides a uniform interface to a wide variety of tasks, including classification, clustering, recommendation, and ranking.
Pages: 321-334
Page count: 14