Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework

Times cited: 0
Authors
Esmaeilpour, Arina [1, 2]
Bigdeli, Elnaz [2, 3]
Cheraghchi, Fatemeh [2, 3]
Raahemi, Bijan [2]
Far, Behrouz H. [1]
Affiliations
[1] Univ Calgary, Dept Elect & Comp Engn, 2500 Univ Dr NW, Calgary, AB, Canada
[2] Univ Ottawa, Knowledge Discovery & Data Min Lab, Telfer Sch Management, 55 Laurier Ave E, Ottawa, ON, Canada
[3] Univ Ottawa, Dept Comp Sci, 600 King Edward, Ottawa, ON, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2016 | 2016 / Vol. 9673
Keywords
Distributed density-based clustering; Distributed cluster summarization; Gaussian mixture model; MapReduce; CLUSTERING-ALGORITHM; MR-DBSCAN
DOI
10.1007/978-3-319-34111-8_39
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With an accelerating rate of data generation, sophisticated techniques are essential to meet scalability requirements. One of the promising avenues for handling large datasets is distributed storage and processing. Further, data summarization is a useful concept for managing large datasets, wherein a subset of the data can be used to provide an approximate yet useful representation. Consolidation of these tools can allow a distributed implementation of data summarization. In this paper, we achieve this by proposing and implementing a distributed Gaussian Mixture Model Summarization using the MapReduce framework (MR-SGMM). In MR-SGMM, we partition input data, cluster the data within each partition with a density-based clustering algorithm called DBSCAN, and for all clusters we discover SGMM core points and their features. We test the implementation with synthetic and real datasets to demonstrate its validity and efficiency. This paves the way for a scalable implementation of Summarization using Gaussian Mixture Model (SGMM).
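The pipeline named in the abstract (partition, cluster each partition with DBSCAN, summarize each cluster) can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the grid partitioning on the first coordinate, and the use of one diagonal Gaussian (weight, mean, variance) per cluster in place of the paper's SGMM core points. It also runs in plain Python rather than on a MapReduce cluster and does not merge clusters that span partition boundaries.

```python
from collections import defaultdict
from math import dist
from statistics import fmean, pvariance


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue             # already assigned to some cluster
            labels[j] = cluster
            nbrs = [k for k in range(len(points))
                    if dist(points[j], points[k]) <= eps]
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


def summarize_partition(points, eps, min_pts):
    """'Reduce' step (assumed): one diagonal Gaussian per DBSCAN cluster."""
    clusters = defaultdict(list)
    for p, label in zip(points, dbscan(points, eps, min_pts)):
        if label != -1:
            clusters[label].append(p)
    summary = []
    for members in clusters.values():
        dims = list(zip(*members))   # transpose to per-dimension value tuples
        summary.append({
            "weight": len(members),
            "mean": tuple(fmean(d) for d in dims),
            "var": tuple(pvariance(d) for d in dims),
        })
    return summary


def mr_summarize(points, width, eps, min_pts):
    """'Map' step (assumed): key each point by a strip of the first coordinate."""
    partitions = defaultdict(list)
    for p in points:
        partitions[int(p[0] // width)].append(p)
    return {key: summarize_partition(part, eps, min_pts)
            for key, part in sorted(partitions.items())}


# Hypothetical toy data: two tight groups and one isolated point.
demo_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
               (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
               (9.0, 0.0)]
summaries = mr_summarize(demo_points, width=4, eps=0.2, min_pts=3)
```

In this sketch each partition's output is a short list of Gaussian parameters rather than the raw points, which is the sense in which a summary can stand in for the data; how the paper selects and merges its core points is not reproduced here.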
Pages: 323-335
Number of pages: 13