Marginal Pseudo-Likelihood Learning of Discrete Markov Network Structures

Times Cited: 13
Authors
Pensar, Johan [1 ]
Nyman, Henrik [1 ]
Niiranen, Juha [2 ]
Corander, Jukka [2 ,3 ]
Affiliations
[1] Åbo Akademi University, Department of Mathematics and Statistics, Turku, Finland
[2] University of Helsinki, Department of Mathematics and Statistics, Helsinki, Finland
[3] University of Oslo, Department of Biostatistics, Oslo, Norway
Source
BAYESIAN ANALYSIS | 2017, Vol. 12, No. 4
Funding
Academy of Finland;
Keywords
Markov networks; structure learning; pseudo-likelihood; non-chordal graph; Bayesian inference; regularization; context-specific independence; Ising model selection; random fields; Bayesian networks; graphical models; trees
DOI
10.1214/16-BA1032
CLC Number
O1 [Mathematics];
Discipline Code
0701; 070101;
Abstract
Markov networks are a popular tool for modeling multivariate distributions over a set of discrete variables. The core of the Markov network representation is an undirected graph which elegantly captures the dependence structure over the variables. Traditionally, Bayesian learning of the graph structure from data has been carried out under the assumption of chordality, since non-chordal graphs are difficult to evaluate with likelihood-based scores. Recently, there has been a surge of interest in regularized pseudo-likelihood methods, as such approaches can avoid the assumption of chordality. Many of the currently available methods require a tuning parameter to adapt the level of regularization to a particular dataset. Here we introduce the marginal pseudo-likelihood, which has a built-in regularization through marginalization over the graph-specific nuisance parameters. We prove consistency of the resulting graph estimator via comparison with the pseudo-Bayesian information criterion. To identify high-scoring graph structures in a high-dimensional setting, we design a two-step algorithm that exploits the decomposable structure of the score. Using synthetic and existing benchmark networks, the marginal pseudo-likelihood method is shown to perform favorably against recent popular structure learning methods.
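For orientation, the score described in the abstract can be written out explicitly. The following is a sketch under standard Dirichlet-multinomial assumptions, in our own notation (mb(j) for the Markov blanket of node j); it is not quoted from the paper. The marginal pseudo-likelihood of a graph G factorizes over the d nodes:

\[
\widehat{p}(\mathbf{x} \mid G) = \prod_{j=1}^{d} p\bigl(\mathbf{x}_j \mid \mathbf{x}_{mb(j)}\bigr)
= \prod_{j=1}^{d} \prod_{l=1}^{q_j} \frac{\Gamma(\alpha_{jl})}{\Gamma(n_{jl} + \alpha_{jl})} \prod_{k=1}^{r_j} \frac{\Gamma(n_{jkl} + \alpha_{jkl})}{\Gamma(\alpha_{jkl})}
\]

where r_j is the number of states of node j, q_j the number of joint configurations of mb(j), n_{jkl} the number of samples with node j in state k while mb(j) is in configuration l, n_{jl} = \sum_k n_{jkl}, and the \alpha's are Dirichlet hyperparameters with \alpha_{jl} = \sum_k \alpha_{jkl}. Each Gamma-ratio term is a closed-form Dirichlet-multinomial integral; marginalizing out the nuisance parameters in this way is the built-in regularization the abstract refers to, and the per-node factorization is the decomposability that the two-step search algorithm exploits.

A minimal runnable sketch of one node-wise factor in log space, assuming BDeu-style symmetric hyperparameters (the function name and the alpha parameterization are ours, not the paper's):

import numpy as np
from scipy.special import gammaln

def log_local_mpl(counts, alpha=1.0):
    """Log marginal pseudo-likelihood contribution of one node.

    counts: (q_j, r_j) array; counts[l, k] = number of samples with
    the node's Markov blanket in configuration l and the node in state k.
    alpha: total equivalent sample size, split evenly across cells (assumed).
    """
    q, r = counts.shape
    a = alpha / (q * r)                  # per-cell pseudo-count alpha_{jkl}
    n_l = counts.sum(axis=1)             # blanket-configuration totals n_{jl}
    score = np.sum(gammaln(r * a) - gammaln(n_l + r * a))   # Gamma(alpha_{jl}) ratios
    score += np.sum(gammaln(counts + a) - gammaln(a))       # per-cell Gamma ratios
    return score

The log-MPL of a candidate graph is then the sum of such terms over all nodes, which is why an edge change only requires rescoring the two nodes whose Markov blankets it alters.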
Pages: 1195-1215
Number of pages: 21