Distributed Gaussian Mixture Model Summarization Using the MapReduce Framework

Cited by: 0
Authors
Esmaeilpour, Arina [1, 2]
Bigdeli, Elnaz [2, 3]
Cheraghchi, Fatemeh [2, 3]
Raahemi, Bijan [2]
Far, Behrouz H. [1]
Affiliations
[1] Univ Calgary, Dept Elect & Comp Engn, 2500 Univ Dr NW, Calgary, AB, Canada
[2] Univ Ottawa, Knowledge Discovery & Data Min Lab, Telfer Sch Management, 55 Laurier Ave E, Ottawa, ON, Canada
[3] Univ Ottawa, Dept Comp Sci, 600 King Edward, Ottawa, ON, Canada
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2016 | 2016 / Vol. 9673
Keywords
Distributed density-based clustering; Distributed cluster summarization; Gaussian mixture model; MapReduce; CLUSTERING-ALGORITHM; MR-DBSCAN;
DOI
10.1007/978-3-319-34111-8_39
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With an accelerating rate of data generation, sophisticated techniques are essential to meet scalability requirements. One of the promising avenues for handling large datasets is distributed storage and processing. Further, data summarization is a useful concept for managing large datasets, wherein a subset of the data can be used to provide an approximate yet useful representation. Consolidation of these tools can allow a distributed implementation of data summarization. In this paper, we achieve this by proposing and implementing a distributed Gaussian Mixture Model Summarization using the MapReduce framework (MR-SGMM). In MR-SGMM, we partition input data, cluster the data within each partition with a density-based clustering algorithm called DBSCAN, and for all clusters we discover SGMM core points and their features. We test the implementation with synthetic and real datasets to demonstrate its validity and efficiency. This paves the way for a scalable implementation of Summarization using Gaussian Mixture Model (SGMM).
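The pipeline described in the abstract (partition the data, run DBSCAN inside each partition, then summarize every cluster) can be illustrated with a minimal, single-process sketch. This is not the authors' MR-SGMM implementation: the abstract does not say how SGMM core points are selected or how partitions are merged, so the sketch assumes one diagonal Gaussian per cluster as a stand-in summary and a reducer that simply concatenates the per-partition summaries. The names `dbscan`, `map_partition`, and `reduce_summaries`, and the values of `eps` and `min_pts`, are hypothetical.

```python
from math import dist
from statistics import fmean, pvariance


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN. Returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


def map_partition(points, eps, min_pts):
    """'Map' step: cluster one partition, then summarize each cluster.

    ASSUMPTION: one diagonal Gaussian (mean, variance, count) per cluster,
    used here in place of the paper's SGMM core points.
    """
    labels = dbscan(points, eps, min_pts)
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    summaries = []
    for members in clusters.values():
        dims = list(zip(*members))  # transpose to per-dimension value lists
        summaries.append({
            "mean": tuple(fmean(d) for d in dims),
            "var": tuple(pvariance(d) for d in dims),
            "count": len(members),
        })
    return summaries


def reduce_summaries(per_partition):
    """'Reduce' step: pool all summaries and attach mixture weights.

    ASSUMPTION: clusters split across partitions are not merged here.
    """
    merged = [s for part in per_partition for s in part]
    total = sum(s["count"] for s in merged)
    return [dict(s, weight=s["count"] / total) for s in merged]


# Two toy partitions; (9.0, 0.0) is isolated and should be labeled noise.
partition_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
partition_b = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (9.0, 0.0)]

mixture = reduce_summaries(
    [map_partition(p, eps=0.5, min_pts=3) for p in (partition_a, partition_b)]
)
for component in mixture:
    print(component)
```

In a real MapReduce job each call to `map_partition` would run on a separate worker, and only the compact summaries, not the raw points, would be shipped to the reducer.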
Pages: 323-335
Page count: 13
Related papers
50 in total
[31]   Anomaly Intrusion Detection System Using Gaussian Mixture Model [J].
Bahrololum, M. ;
Khaleghi, A. .
THIRD 2008 INTERNATIONAL CONFERENCE ON CONVERGENCE AND HYBRID INFORMATION TECHNOLOGY, VOL 1, PROCEEDINGS, 2008, :1162-1167
[32]   Physical Layer Authentication Enhancement Using a Gaussian Mixture Model [J].
Qiu, Xiaoying ;
Jiang, Ting ;
Wu, Sheng ;
Hayes, Monson .
IEEE ACCESS, 2018, 6 :53583-53592
[33]   Tooth Segmentation Using Gaussian Mixture Model and Genetic Algorithm [J].
Kim, Joo Young ;
Yoo, Sun K. ;
Jang, W. S. ;
Park, Byung Eun ;
Park, Wonse ;
Kim, Kee Deog .
JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS, 2017, 7 (06) :1271-1276
[34]   Foreground Detection of Moving Object Using Gaussian Mixture Model [J].
Aslam, Nazia ;
Sharma, Veena .
2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, :1071-1074
[35]   Text Independent Speaker Identification Using Gaussian Mixture Model [J].
Ting, Chee-Ming ;
Salleh, Sh-Hussain ;
Tan, Tian-Swee ;
Ariff, A. K. .
ICIAS 2007: INTERNATIONAL CONFERENCE ON INTELLIGENT & ADVANCED SYSTEMS, VOLS 1-3, PROCEEDINGS, 2007, :194-198
[36]   CLUSTERING OF CHILDHOOD DIARRHEA DISEASES USING GAUSSIAN MIXTURE MODEL [J].
Faidah, Defi yusti ;
Hudzaifa, Ashilla maula ;
Pontoh, Resa septiani .
COMMUNICATIONS IN MATHEMATICAL BIOLOGY AND NEUROSCIENCE, 2024,
[37]   Distributed Recommendation Algorithm Based on Matrix Decomposition on MapReduce Framework [J].
Wu, Sen ;
Lu, Dan ;
Du, Yannan ;
Feng, Xiaodong .
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2015, 2015, 9403 :447-457
[38]   DISTRIBUTED LOG ANALYSIS ON THE CLOUD USING MapReduce [J].
Aydin, Galip ;
Hallac, Ibrahim R. .
TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2016, 23 (04) :1011-1016
[39]   Gaussian Mixture Model and Gaussian Supervector for Image Classification [J].
Jiang, Yuechi ;
Leung, Frank H. F. .
2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
[40]   A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data [J].
Xiaofeng Dai ;
Timo Erkkilä ;
Olli Yli-Harja ;
Harri Lähdesmäki .
BMC Bioinformatics, 10