Learning Markov Blanket Bayesian Network for Big Data in MapReduce

Cited by: 0
Authors
Che, Yuxin [1]
Hong, Shaohui [1]
Zhang, Defu [1]
Zhang, Liming [2]
Affiliations
[1] Xiamen Univ, Dept Comp Sci, Xiamen 361005, Peoples R China
[2] Univ Macau, Dept Comp Informat Sci, Macau, Peoples R China
Keywords
Big Data; MapReduce; Bayesian Network; Markov blanket; Data Mining; Classification
DOI
10.1109/ICTAI.2016.135
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Processing massive data is a challenging task for data mining in the big data era, and MapReduce is an attractive model for meeting this challenge. This paper presents a new method to accelerate the learning of Markov blanket Bayesian networks (MBBN). On some complex datasets, the Markov blanket is a better type of Bayesian network model. However, the time and space cost of learning a Markov blanket is large and grows quickly as the number of variables increases, and the independence tests it relies on require large amounts of data, which makes the problem harder. The statistical phase and the independence tests are parallelized in the MapReduce framework so that appropriate relations among variables can be found. Computational results on four datasets show that MapReduce yields a speed-up. In particular, the Markov blanket model learned in MapReduce achieves a higher accuracy rate than the naive Bayesian and tree-augmented naive Bayesian classifiers.
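As a rough illustration of how a statistical phase and an independence test might be split into map and reduce steps, the sketch below counts joint value occurrences per data shard, merges the partial counts, and then computes a test statistic from the merged table. The abstract does not say which independence test, key layout, or MapReduce runtime the authors use, so the G² statistic, the in-memory "shards", and the function names `map_counts`, `reduce_counts`, and `g_statistic` are all assumptions made for this example, not the paper's method.

```python
from collections import Counter
from functools import reduce
from itertools import combinations
import math


def map_counts(shard):
    """Map step (hypothetical): count joint values of every variable pair in one shard."""
    counts = Counter()
    for row in shard:
        for i, j in combinations(range(len(row)), 2):
            counts[(i, j, row[i], row[j])] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step (hypothetical): merge the partial counts of two shards."""
    return left + right


def g_statistic(counts, i, j):
    """G^2 statistic for marginal independence of variables i and j (with i < j).

    G^2 is an assumed choice of test; larger values suggest dependence.
    """
    joint = {(x, y): n for (a, b, x, y), n in counts.items() if (a, b) == (i, j)}
    total = sum(joint.values())
    row_totals, col_totals = Counter(), Counter()
    for (x, y), n in joint.items():
        row_totals[x] += n
        col_totals[y] += n
    return 2.0 * sum(
        n * math.log(n * total / (row_totals[x] * col_totals[y]))
        for (x, y), n in joint.items()
    )


# Toy data: variable 1 copies variable 0, variable 2 is unrelated to both.
shards = [
    [(0, 0, 0), (0, 0, 1)],
    [(1, 1, 0), (1, 1, 1)],
]
counts = reduce(reduce_counts, map(map_counts, shards), Counter())
dependent = g_statistic(counts, 0, 1)    # variables 0 and 1 move together
independent = g_statistic(counts, 0, 2)  # variables 0 and 2 do not
```

Only the counting depends on the raw records, so in this sketch it is the part that would be distributed; the statistic itself is computed from the much smaller merged count table. A Markov blanket learner would also need conditional tests, which would extend the count keys with the values of the conditioning variables.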
Pages: 896-900
Number of pages: 5