On Model Selection, Bayesian Networks, and the Fisher Information Integral

Cited by: 0

Authors:

Yuan Zou

Teemu Roos

Affiliation:

[1] University of Helsinki, Helsinki Institute for Information Technology HIIT, Department of Computer Science

Source:

New Generation Computing | 2017 / Vol. 35

Keywords:

Model selection; Bayesian networks; Fisher information approximation; NML; BIC

DOI:

Not available

Abstract:

We study BIC-like model selection criteria and, in particular, their refinements that include a constant term involving the Fisher information matrix. We perform numerical simulations that enable increasingly accurate approximation of this constant in the case of Bayesian networks. We observe that for complex Bayesian network models, the constant term is a negative number with a very large absolute value that dominates the other terms for small and moderate sample sizes. For networks with a fixed number of parameters, d, the leading term in the complexity penalty, which is proportional to d, is the same. However, as we show, the constant term can vary significantly depending on the network structure even if the number of parameters is fixed. Based on our experiments, we conjecture that the distribution of the nodes’ outdegree is a key factor. Furthermore, we demonstrate that the constant term can have a dramatic effect on model selection performance for small sample sizes.
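
For orientation, the criteria the abstract refers to are commonly written as below. This is a sketch in standard notation that is assumed here and not taken from this record: x^n is a sample of size n, \hat{\theta} is the maximum likelihood estimate, d is the number of free parameters, \Theta is the parameter space, and I(\theta) is the Fisher information matrix per observation.

```latex
\begin{align}
\mathrm{BIC}(x^n) &= -\log p\bigl(x^n \mid \hat{\theta}(x^n)\bigr) + \frac{d}{2}\log n, \\
\mathrm{FIA}(x^n) &= -\log p\bigl(x^n \mid \hat{\theta}(x^n)\bigr) + \frac{d}{2}\log\frac{n}{2\pi}
                     + \log\int_{\Theta}\sqrt{\det I(\theta)}\,d\theta .
\end{align}
```

Under these assumptions the two criteria differ only by terms that do not depend on n, namely -(d/2) log 2\pi and the logarithm of the Fisher information integral. Two network structures with the same d share the (d/2) log n term, so any difference in penalty between them comes from the integral.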

Pages: 5-27

Number of pages: 22
