Efficient deadline-aware scheduling for the analysis of Big Data streams in public Cloud

Cited by: 14
Authors
Mortazavi-Dehkordi, Mahmood [1]
Zamanifar, Kamran [1]
Affiliations
[1] Univ Isfahan, Comp Engn Fac, Software Dept, Esfahan, Iran
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2020 / Vol. 23 / Issue 01
Keywords
Streaming Big Data analysis query; Deadline-aware scheduling; Cloud-based stream processing; REAL-TIME; RESOURCE-MANAGEMENT; SIMULATION;
DOI
10.1007/s10586-019-02908-2
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The emergence of Big Data has had a profound impact on how data are analyzed. Open source distributed stream processing platforms have gained popularity for analyzing streaming Big Data as they provide low latency required for streaming Big Data applications using Cloud resources. However, existing resource schedulers are still lacking the efficiency and deadline meeting that Big Data analytical applications require. Recent works have already considered streaming Big Data characteristics to improve the efficiency and the likelihood of deadline meeting for scheduling in the platforms. Nevertheless, they have not taken into account the specific attributes of analytical application, public Cloud utilization cost and delays caused by performance degradation of leasing public Cloud resources. This study, therefore, presents BCframework, an efficient deadline-aware scheduling framework used by streaming Big Data analysis applications based on public Cloud resources. BCframework proposes a scheduling model which considers public Cloud utilization cost, performance variation, deadline meeting and latency reduction requirements of streaming Big Data analytical applications. Furthermore, it introduces two operator scheduling algorithms based on both a novel partitioning algorithm and an operator replication method. BCframework is highly adaptable to the fluctuation of streaming Big Data and the performance degradation of public Cloud resources. Experiments with the benchmark and real-world queries show that BCframework can significantly reduce the latency and utilization cost and also minimize deadline violations and provisioned virtual machine instances.
Pages: 241 - 263
Page count: 23
Related papers
47 in total
  • [21] A benchmark approach and its toolkit for online scheduling of multiple deadline-constrained workflows in big-data processing systems
    Zhang, Dongzhan
    Yan, Wenjing
    Bugingo, Emmanuel
    Zheng, Wei
    Chen, Jinjun
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 85 : 222 - 234
  • [22] CSL-driven and energy-efficient resource scheduling in cloud data center
    Li, Hongjian
    Zhao, Yuyan
    Fang, Shuyong
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (01) : 481 - 498
  • [23] Efficient ID-based public auditing for the outsourced data in cloud storage
    Zhang, Jianhong
    Dong, Qiaocui
    INFORMATION SCIENCES, 2016, 343 : 1 - 14
  • [24] Data-Importance Aware User Scheduling for Communication-Efficient Edge Machine Learning
    Liu, Dongzhu
    Zhu, Guangxu
    Zhang, Jun
    Huang, Kaibin
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2021, 7 (01) : 265 - 278
  • [25] Multilevel resource allocation for performance-aware energy-efficient cloud data centers
    Rossi, Fabio Diniz
    Severo de Souza, Paulo Silas
    Marques, Wagner dos Santos
    Conterato, Marcelo da Silva
    Ferreto, Tiago Coelho
    Lorenzon, Arthur Francisco
    Luizelli, Marcelo Caggiani
    2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2019 : 462 - 467
  • [26] Energy-aware resource allocation heuristics for efficient management of data centers for Cloud computing
    Beloglazov, Anton
    Abawajy, Jemal
    Buyya, Rajkumar
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2012, 28 (05) : 755 - 768
  • [27] Green Cloud Computing: Efficient Energy-Aware and Dynamic Resources Management in Data Centers
    Diouani, Sara
    Medromi, Hicham
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (07) : 124 - 127
  • [28] Interference-Aware Workload Placement for Improving Latency Distribution of Converged HPC/Big Data Cloud Infrastructures
    Tzenetopoulos, Achilleas
    Masouros, Dimosthenis
    Xydis, Sotirios
    Soudris, Dimitrios
    EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING, AND SIMULATION, SAMOS 2021, 2022, 13227 : 108 - 123
  • [29] Employing Vertical Elasticity for Efficient Big Data Processing in Container-Based Cloud Environments
    Choi, Jin-young
    Cho, Minkyoung
    Kim, Jik-Soo
    APPLIED SCIENCES-BASEL, 2021, 11 (13):
  • [30] Soft error-aware energy-efficient task scheduling for workflow applications in DVFS-enabled cloud
    Wu, Tingming
    Gu, Haifeng
    Zhou, Junlong
    Wei, Tongquan
    Liu, Xiao
    Chen, Mingsong
    JOURNAL OF SYSTEMS ARCHITECTURE, 2018, 84 : 12 - 27