Reduction of Workflow Resource Consumption Using a Density-based Clustering Model

Cited by: 7
Authors
Zhang, Qimin [1]
Kremer-Herman, Nathaniel [2]
Tovar, Benjamin [2]
Thain, Douglas [2]
Affiliations
[1] Chinese Acad Sci, Technol & Engn Ctr Space Utilizat, Key Lab Space Utilizat, Beijing, Peoples R China
[2] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
Source
PROCEEDINGS OF WORKS 2018: 13TH IEEE/ACM WORKSHOP ON WORKFLOWS IN SUPPORT OF LARGE-SCALE SCIENCE (WORKS) | 2018
Keywords
high throughput computing (HTC); density-based clustering; automatic resource allocation; resource consumption optimization;
DOI
10.1109/WORKS.2018.00006
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
An end user running a scientific workflow will often request orders of magnitude too few or too many resources. If the request is too small, the job may fail due to resource exhaustion; if it is too large, the job may succeed but resources are wasted. Ideally, the workflow would run with a near-optimal allocation that lets all jobs succeed while minimizing resource waste. We present a strategy for addressing this resource allocation problem: (1) the resources consumed by each job are recorded by a resource monitoring tool; (2) a density-based clustering model is proposed for discovering clusters among all jobs; (3) for each cluster, a maximal resource requisition is calculated and used as the ideal allocation for that cluster. We ran experiments with a synthetic workflow of homogeneous tasks as well as the bioinformatics tools Lifemapper, SHRIMP, BWA, and BWA-GATK to capture the inherent nature of a workflow's resource consumption, the clustering the model produces, and its usefulness in real workflows. In Lifemapper, the minimum savings in time, cores, memory, and disk are 13.82%, 16.62%, 49.15%, and 93.89%, respectively. In SHRIMP, BWA, and BWA-GATK, the minimum savings in cores, memory, and disk are 50%, 90.14%, and 51.82%, respectively. Compared with a fixed resource allocation strategy, our approach provides a noticeable reduction in workflow resource consumption.
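The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain DBSCAN-style algorithm stands in for the paper's density-based model, that each job is described by a hypothetical (cores, memory in GB, disk in GB) measurement, and that jobs marked as noise are simply left out of the per-cluster maximum. The sample values, `eps`, and `min_pts` are invented for illustration.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one integer label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def allocations(points, labels):
    """Per cluster, the per-dimension maximum usage of its member jobs."""
    alloc = {}
    for p, label in zip(points, labels):
        if label == NOISE:
            continue  # assumption: outlier jobs get no cluster allocation
        current = alloc.get(label, p)
        alloc[label] = tuple(max(a, b) for a, b in zip(current, p))
    return alloc


# Hypothetical monitor output: (cores, memory_gb, disk_gb) per finished job.
jobs = [
    (1, 2.0, 1.0), (1, 2.1, 1.2), (1, 1.9, 1.1), (1, 2.2, 0.9),
    (4, 8.0, 5.0), (4, 8.3, 5.2), (4, 7.8, 4.9), (4, 8.1, 5.1),
    (2, 30.0, 0.5),  # isolated job, expected to be labeled as noise
]

labels = dbscan(jobs, eps=1.0, min_pts=3)
alloc = allocations(jobs, labels)
print(labels)
print(alloc)
```

In practice the dimensions would likely need rescaling before clustering, since raw units (cores versus gigabytes) weight the Euclidean distance unevenly; the sketch skips that step to stay short.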
Pages: 1-9
Page count: 9