Duplicate Detection for Bayesian Network Structure Learning

被引:1
作者
Jahnsson, Niklas [1 ]
Malone, Brandon [2 ,3 ]
Myllymaki, Petri [4 ]
机构
[1] Univ Helsinki, Helsinki, Finland
[2] Heidelberg Univ, Dept Internal Med 3, Sect Bioinformat & Syst Cardiol, Heidelberg, Germany
[3] Heidelberg Univ, Klaus Tschira Inst Integrat Computat Cardiol, Heidelberg, Germany
[4] Univ Helsinki, Helsinki Inst Informat Technol, Helsinki, Finland
关键词
Bayesian networks; Structure learning; State space search; Delayed duplicate detection; Structured duplicate detection; ALGORITHM;
D O I
10.1007/s00354-016-0004-9
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We address the well-known score-based Bayesian network structure learning problem. Breadth-first branch and bound (BFBnB) has been shown to be an effective approach for solving this problem. Duplicate detection is an important component of the BFBnB algorithm. Previously, an external sorting-based technique was used for delayed duplicate detection (DDD). We propose a hashing-based technique for DDD and a bin packing algorithm for minimizing the number of external memory files and operations. We also give a structured duplicate detection approach which completely eliminates DDD. Importantly, these techniques ensure the search algorithms respect any given memory bound. Empirically, we demonstrate that structured duplicate detection is significantly faster than the previous state of the art in limited-memory settings. Our results show that the bin packing algorithm incurs some overhead, but that the overhead is offset by reducing I/O when more memory is available.
引用
收藏
页码:47 / 67
页数:21
相关论文
共 36 条
[1]  
[Anonymous], P 28 AAAI C ART INT
[2]  
[Anonymous], 2006, P 22 C UNCERTAINTY A
[3]  
Bartlett M., 2015, ARTIF INTEL IN PRESS
[4]  
Chickering D., 1996, Learning from data: Artificial intelligence and statistics V, P121, DOI 10.1007/978-1-4612-2404-4_12
[5]  
COOPER GF, 1992, MACH LEARN, V9, P309, DOI 10.1007/BF00994110
[6]  
de Campos CP, 2011, J MACH LEARN RES, V12, P663
[7]   A new approach for learning belief networks using independence criteria [J].
de Campos, LM ;
Huete, JF .
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2000, 24 (01) :11-37
[8]  
Edelkamp S, 2012, HEURISTIC SEARCH: THEORY AND APPLICATIONS, P1
[9]  
Fan XN, 2014, AAAI CONF ARTIF INTE, P2439
[10]  
Garey M. R., 1979, Computers and intractability. A guide to the theory of NP-completeness