Counting Markov equivalence classes for DAG models on trees

Cited by: 8
Authors
Radhakrishnan, Adityanarayanan [1, 2]
Solus, Liam [3]
Uhler, Caroline [1, 2]
Affiliations
[1] MIT, Lab Informat & Decis Syst, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Inst Data Syst & Soc, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] KTH Royal Inst Technol, Stockholm, Sweden
Funding
U.S. National Science Foundation
Keywords
DAG model; Bayesian network; Markov equivalence class; Markov equivalence; Trees; Immoralities; DIRECTED ACYCLIC GRAPHS; BAYESIAN NETWORKS; ENUMERATION; DIGRAPHS;
DOI
10.1016/j.dam.2018.03.015
Chinese Library Classification number
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
DAG models are statistical models satisfying a collection of conditional independence relations encoded by the nonedges of a directed acyclic graph (DAG) G. Such models are used to model complex cause-effect systems across a variety of research fields. From observational data alone, a DAG model G is only recoverable up to Markov equivalence. Combinatorially, two DAGs are Markov equivalent if and only if they have the same underlying undirected graph (i.e., skeleton) and the same set of induced subDAGs i -> j <- k, known as immoralities. Hence it is of interest to study the number and size of Markov equivalence classes (MECs). In a recent paper, we introduced a pair of generating functions that enumerate the number of MECs on a fixed skeleton by number of immoralities and by class size, and we studied the complexity of computing these functions. In this paper, we lay the foundation for studying these generating functions by analyzing their structure for trees and other closely related graphs. We describe these polynomials for some well-studied families of graphs including paths, stars, cycles, spider graphs, caterpillars, and balanced binary trees. In doing so, we recover connections to independence polynomials and extend some classical identities that hold for Fibonacci numbers. We also provide tight lower and upper bounds for the number and size of MECs on any tree. Finally, we use computational methods to show that the number and distribution of high-degree nodes in a triangle-free graph dictate the number and size of MECs. (C) 2018 Elsevier B.V. All rights reserved.
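The combinatorial criterion stated in the abstract (same skeleton, same immoralities) can be illustrated with a small brute-force sketch. This is not the authors' method or code: the function names (`skeleton`, `immoralities`, `markov_equivalent`, `count_mecs_on_tree`) and the edge-list representation are assumptions made purely for illustration, and the enumeration is exponential in the number of edges, so it only suits very small trees.

```python
from itertools import combinations, product


def skeleton(edges):
    """Underlying undirected graph: each directed edge (i, j) becomes {i, j}."""
    return {frozenset(e) for e in edges}


def immoralities(edges):
    """Induced subDAGs i -> j <- k in which i and k are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for i, j in edges:
        parents.setdefault(j, set()).add(i)
    found = set()
    for j, ps in parents.items():
        for i, k in combinations(sorted(ps), 2):
            if frozenset((i, k)) not in skel:
                found.add((frozenset((i, k)), j))
    return found


def markov_equivalent(dag1, dag2):
    """Criterion from the abstract: same skeleton and same immoralities."""
    return (skeleton(dag1) == skeleton(dag2)
            and immoralities(dag1) == immoralities(dag2))


def count_mecs_on_tree(undirected_edges):
    """Count MECs on a tree skeleton by enumerating all orientations.

    Assumption: the input is a tree (or forest), so every orientation is
    acyclic and no acyclicity check is needed. With the skeleton fixed,
    a class is identified by its set of immoralities.
    """
    classes = set()
    for flips in product((False, True), repeat=len(undirected_edges)):
        dag = [(v, u) if flip else (u, v)
               for (u, v), flip in zip(undirected_edges, flips)]
        classes.add(frozenset(immoralities(dag)))
    return len(classes)


if __name__ == "__main__":
    # Paths on 3, 4, and 5 nodes, given as undirected edge lists.
    for p in (3, 4, 5):
        path = [(v, v + 1) for v in range(1, p)]
        print(p, count_mecs_on_tree(path))
```

For paths on 3, 4, and 5 nodes this enumeration yields 2, 3, and 5 classes, which is consistent with the Fibonacci connection mentioned in the abstract.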
Pages: 170-185
Number of pages: 16
Related papers
28 in total
  • [1] Bayesian networks in environmental modelling
    Aguilera, P. A.
    Fernandez, A.
    Fernandez, R.
    Rumi, R.
    Salmeron, A.
    [J]. ENVIRONMENTAL MODELLING & SOFTWARE, 2011, 26 (12) : 1376 - 1388
  • [2] Alavi Y., 1987, Congr. Numer., V58, P15
  • [3] Andersson SA, 1997, ANN STAT, V25, P505
  • [4] [Anonymous], P 17 C UNC ART INT
  • [5] [Anonymous], 2001, Causation, Prediction, and Search
  • [6] BOUNDS ON THE EXPECTED SIZE OF THE MAXIMUM AGREEMENT SUBTREE
    Bernstein, Daniel Irving
    Ho, Lam Si Tung
    Long, Colby
    Steel, Mike
    St John, Katherine
    Sullivant, Seth
    [J]. SIAM JOURNAL ON DISCRETE MATHEMATICS, 2015, 29 (04) : 2065 - 2074
  • [7] Branden Petter, 2015, HDB ENUMERATIVE COMB, P437
  • [8] r-Stable hypersimplices
    Braun, Benjamin
    Solus, Liam
    [J]. JOURNAL OF COMBINATORIAL THEORY SERIES A, 2018, 157 : 349 - 388
  • [9] Drton Mathias, 2008, Lectures on algebraic statistics, V39
  • [10] Using Bayesian networks to analyze expression data
    Friedman, N
    Linial, M
    Nachman, I
    Pe'er, D
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) : 601 - 620