K-MLIO: Enabling K-Means for Large Data-sets and Memory Constrained Embedded Systems

Cited by: 3
Authors
Slimani, Camelia [1]
Rubini, Stephane [1]
Boukhobza, Jalil [1]
Affiliation
[1] Univ Brest, Lab STICC, CNRS, UMR 6285, F-29200 Brest, France
Source
2019 IEEE 27TH INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS 2019) | 2019
Keywords
K-means; I/O optimization; embedded systems; machine learning
DOI
10.1109/MASCOTS.2019.00037
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine Learning (ML) algorithms are increasingly used in embedded systems to perform tasks such as clustering and pattern recognition. These algorithms are both compute and memory intensive, whilst embedded devices offer lower hardware capabilities than traditional ML platforms. K-means clustering is one of the most widely used ML algorithms. In the case of large data-sets, our analysis showed that, on average, more than 70% of the execution time is spent on I/Os. In this paper, we present a version of K-means that drastically reduces the number of I/Os by scanning the data-set only once, whereas the traditional version reads it once per iteration. Our evaluation showed that the proposed strategy reduces the overall execution time on large data-sets by 60% on average while lowering the number of I/O operations by 90%, with a precision comparable to the traditional K-means implementation.
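The abstract does not describe how K-MLIO itself works, so the following is only a generic sketch of one common way to cluster a data-set in a single pass, and should not be read as the authors' algorithm. It assumes the data arrives in memory-sized chunks: each chunk is clustered on its own, and the resulting centroids, weighted by the number of points they represent, are clustered again at the end. The names `nearest`, `weighted_kmeans`, and `single_pass_kmeans`, the fixed iteration count, and the random initialization are all illustrative assumptions.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Lloyd's algorithm on in-memory points, each carrying a weight.

    Returns the centroids and the total weight assigned to each one.
    """
    k = min(k, len(points))  # a short final chunk may hold fewer than k points
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    mass = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        mass = [0.0] * k
        for p, w in zip(points, weights):
            j = nearest(p, centroids)
            mass[j] += w
            for d in range(dim):
                sums[j][d] += w * p[d]
        for j in range(k):
            if mass[j] > 0:  # keep the old centroid if its cluster is empty
                centroids[j] = [s / mass[j] for s in sums[j]]
    return centroids, mass


def single_pass_kmeans(chunks, k):
    """Cluster chunk by chunk, then merge the weighted chunk centroids.

    `chunks` is any iterable of point lists; it is consumed exactly once,
    so in practice it could be a generator that reads from storage.
    """
    summaries, masses = [], []
    for chunk in chunks:
        cents, mass = weighted_kmeans(chunk, [1.0] * len(chunk), k)
        for c, m in zip(cents, mass):
            if m > 0:
                summaries.append(c)
                masses.append(m)
    final, _ = weighted_kmeans(summaries, masses, k)
    return final


if __name__ == "__main__":
    # Two small, well-separated groups split across two chunks.
    chunks = [
        [(0, 0), (10, 10), (0, 1), (10, 11)],
        [(1, 0), (11, 10), (1, 1), (11, 11)],
    ]
    print(sorted(single_pass_kmeans(chunks, 2)))
```

In this kind of scheme the raw data is read once, and only the small set of chunk centroids is revisited, which is where the I/O saving would come from. The trade-off is that the final clusters are computed from summaries rather than from the original points, so the result can differ from a full multi-pass K-means.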
Pages: 262-268
Number of pages: 7