Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework

Cited by: 0
Authors
Esmaeilpour, Arina [1, 2]
Bigdeli, Elnaz [2, 3]
Cheraghchi, Fatemeh [2, 3]
Raahemi, Bijan [2]
Far, Behrouz H. [1]
Affiliations
[1] Univ Calgary, Dept Elect & Comp Engn, 2500 Univ Dr NW, Calgary, AB, Canada
[2] Univ Ottawa, Knowledge Discovery & Data Min Lab, Telfer Sch Management, 55 Laurier Ave E, Ottawa, ON, Canada
[3] Univ Ottawa, Dept Comp Sci, 600 King Edward, Ottawa, ON, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2016 | 2016 / Vol. 9673
Keywords
Distributed density-based clustering; Distributed cluster summarization; Gaussian mixture model; MapReduce; CLUSTERING-ALGORITHM; MR-DBSCAN;
DOI
10.1007/978-3-319-34111-8_39
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With an accelerating rate of data generation, sophisticated techniques are essential to meet scalability requirements. One of the promising avenues for handling large datasets is distributed storage and processing. Further, data summarization is a useful concept for managing large datasets, wherein a subset of the data can be used to provide an approximate yet useful representation. Consolidation of these tools can allow a distributed implementation of data summarization. In this paper, we achieve this by proposing and implementing a distributed Gaussian Mixture Model Summarization using the MapReduce framework (MR-SGMM). In MR-SGMM, we partition input data, cluster the data within each partition with a density-based clustering algorithm called DBSCAN, and for all clusters we discover SGMM core points and their features. We test the implementation with synthetic and real datasets to demonstrate its validity and efficiency. This paves the way for a scalable implementation of Summarization using Gaussian Mixture Model (SGMM).
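The pipeline outlined in the abstract (partition the input, run DBSCAN inside each partition, then summarize every cluster) can be illustrated with a minimal sketch. It is not the authors' MR-SGMM implementation. The abstract does not define SGMM core points or their features, so the sketch assumes a simple stand-in: each cluster is reduced to its point count, per-dimension mean, and per-dimension variance. The function names (`partition`, `dbscan`, `summarize`, `map_partition`, `mr_summarize`), the strip partitioning on the first coordinate, and the parameter values are all hypothetical, and merging clusters that span partition borders is omitted.

```python
from collections import defaultdict
from math import dist
from statistics import fmean, pvariance


def partition(points, n_parts):
    """Assumed scheme: split points into equal-width strips on the first coordinate."""
    xs = [p[0] for p in points]
    lo = min(xs)
    width = (max(xs) - lo) / n_parts or 1.0  # avoid zero width for degenerate input
    parts = [[] for _ in range(n_parts)]
    for p in points:
        idx = min(int((p[0] - lo) / width), n_parts - 1)
        parts[idx].append(p)
    return parts


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, -1 marks noise."""
    def region(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = region(j)
            if len(nbrs) >= min_pts:  # j is itself a core point, keep expanding
                queue.extend(nbrs)
    return labels


def summarize(cluster):
    """Stand-in summary (an assumption, not the SGMM features from the paper)."""
    dims = list(zip(*cluster))
    return {
        "n": len(cluster),
        "mean": tuple(fmean(d) for d in dims),
        "var": tuple(pvariance(d) for d in dims),
    }


def map_partition(points, eps, min_pts):
    """Map step: cluster one partition locally and emit one summary per cluster."""
    clusters = defaultdict(list)
    for p, label in zip(points, dbscan(points, eps, min_pts)):
        if label != -1:
            clusters[label].append(p)
    return [summarize(c) for c in clusters.values()]


def mr_summarize(points, n_parts, eps, min_pts):
    """Reduce step (simplified): concatenate the per-partition summaries."""
    summaries = []
    for part in partition(points, n_parts):
        summaries.extend(map_partition(part, eps, min_pts))
    return summaries


# Two small synthetic grid-shaped blobs, far apart on the first coordinate.
blob_a = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
blob_b = [(10 + 0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
for s in mr_summarize(blob_a + blob_b, n_parts=2, eps=0.15, min_pts=3):
    print(s)
```

In a real MapReduce job, `map_partition` would run as the mapper on each data split and the concatenation in `mr_summarize` would be replaced by a reducer that also reconciles clusters cut by partition boundaries.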
Pages: 323-335
Page count: 13