Adaptive Bayesian Network Structure Learning from Big Datasets

Cited by: 0
Authors
Tang, Yan [1]
Zhang, Qidong [1]
Liu, Huaxin [1]
Wang, Wangsong [1]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing 210098, Jiangsu, Peoples R China
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2017) | 2017 / Vol. 10179
Keywords
Bayesian network structure learning; Bayesian score; Big data sampling; Ensemble method
DOI
10.1007/978-3-319-55705-2_12
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Because big data contain more comprehensive probability distributions and richer causal relationships than conventional small datasets, discovering Bayesian network (BN) structure from big datasets is becoming increasingly valuable for modeling and reasoning under uncertainty in many areas. When faced with big data, most current BN structure learning algorithms have limitations. First, learning BN structure from big datasets is computationally expensive and often ends in failure. Second, given an arbitrary dataset as input, it is very difficult to choose one algorithm from the many candidates that consistently achieves good learning accuracy. To address these issues, we introduce a novel approach called Adaptive Bayesian network Learning (ABNL). ABNL begins with an adaptive sampling process that extracts a sufficiently large data partition from any big dataset for fast structure learning. Then, ABNL feeds the data partition to different learning algorithms to obtain a collection of BN structures. Lastly, ABNL adaptively chooses among the structures and merges them into a final network structure using an ensemble method. Experimental results on four big datasets show that ABNL performs significantly better than whole-dataset learning and produces more accurate results than baseline algorithms.
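The abstract does not say which ensemble rule ABNL uses to merge the candidate structures. As an illustration only, the sketch below assumes a simple edge-voting rule: an edge is kept if at least `min_votes` candidate structures contain it and adding it keeps the graph acyclic. The function names `creates_cycle` and `merge_structures`, the `min_votes` parameter, and the toy candidate structures are hypothetical and do not come from the paper.

```python
from collections import Counter


def creates_cycle(edges, new_edge):
    """Return True if adding new_edge (u, v) to the DAG `edges` would create a cycle."""
    u, v = new_edge
    # A cycle appears exactly when u is already reachable from v.
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(child for parent, child in edges if parent == node)
    return False


def merge_structures(structures, min_votes):
    """Merge candidate edge lists by voting (an assumed rule, not the paper's)."""
    # Each structure votes at most once per directed edge.
    votes = Counter(edge for s in structures for edge in set(s))
    merged = []
    # Consider the most-voted edges first; break ties by edge name.
    for edge, count in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_votes and not creates_cycle(merged, edge):
            merged.append(edge)
    return merged


# Toy candidate structures, as if produced by three different learners.
candidates = [
    [("A", "B"), ("B", "C")],
    [("A", "B"), ("C", "B")],
    [("A", "B"), ("B", "C"), ("A", "C")],
]
print(merge_structures(candidates, min_votes=2))
```

With these toy inputs, only the edges A→B and B→C reach two votes, so they form the merged structure. A score-weighted vote (for example, weighting each learner by its Bayesian score) would be another plausible variant, but that is likewise an assumption.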
Pages: 158-168
Page count: 11