Efficient deadline-aware scheduling for the analysis of Big Data streams in public Cloud

Cited by: 13
Authors
Mortazavi-Dehkordi, Mahmood [1]
Zamanifar, Kamran [1]
Affiliations
[1] Univ Isfahan, Comp Engn Fac, Software Dept, Esfahan, Iran
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2020, Vol. 23, No. 1
Keywords
Streaming Big Data analysis query; Deadline-aware scheduling; Cloud-based stream processing; REAL-TIME; RESOURCE-MANAGEMENT; SIMULATION;
DOI
10.1007/s10586-019-02908-2
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of Big Data has had a profound impact on how data are analyzed. Open-source distributed stream processing platforms have gained popularity for analyzing streaming Big Data because they provide the low latency that streaming Big Data applications require while running on Cloud resources. However, existing resource schedulers still lack the efficiency and deadline compliance that Big Data analytical applications require. Recent works have considered the characteristics of streaming Big Data to improve scheduling efficiency and the likelihood of meeting deadlines on these platforms. Nevertheless, they have not taken into account the specific attributes of analytical applications, the cost of using the public Cloud, or the delays caused by performance degradation of leased public Cloud resources. This study therefore presents BCframework, an efficient deadline-aware scheduling framework for streaming Big Data analysis applications that run on public Cloud resources. BCframework proposes a scheduling model that considers the public Cloud utilization cost, performance variation, deadline-meeting, and latency-reduction requirements of streaming Big Data analytical applications. Furthermore, it introduces two operator scheduling algorithms based on a novel partitioning algorithm and an operator replication method. BCframework is highly adaptable to fluctuations in streaming Big Data and to performance degradation of public Cloud resources. Experiments with benchmark and real-world queries show that BCframework can significantly reduce latency and utilization cost while also minimizing deadline violations and the number of provisioned virtual machine instances.
Pages: 241-263
Page count: 23
Related papers
47 in total
  • [31] Big data-driven scheduling optimization algorithm for Cyber-Physical Systems based on a cloud platform
    Niu, Chao
    Wang, Lizhou
    [J]. COMPUTER COMMUNICATIONS, 2022, 181 : 173 - 181
  • [32] A Novel Resource Scheduler for Resource Allocation and Scheduling in Big Data Using Hybrid Optimization Algorithm at Cloud Environment
    Selvaraj, Aarthee
    Rajendran, Prabakaran
    Rajangam, Kanimozhi
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2023, 20 (06) : 863 - 873
  • [33] Fast-FFA: a fast online scheduling approach for big data stream computing with future features-aware
    Sun, Dawei
    Tang, Hao
    [J]. INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2017, 10 (03) : 205 - 217
  • [34] Cloud Resource Management for Image and Video Analysis of Big Data from Network Cameras
    Kaseb, Ahmed S.
    Mohan, Anup
    Lu, Yung-Hsiang
    [J]. 2015 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD), 2015, : 287 - 294
  • [35] An efficient power-aware VM allocation mechanism in cloud data centers: a micro genetic-based approach
    Tarahomi, Mehran
    Izadi, Mohammad
    Ghobaei-Arani, Mostafa
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2021, 24 (02) : 919 - 934
  • [36] Simulation and modeling in cloud computing-based smart grid power big data analysis technology
    Padmanaban, K.
    Kalpana, Y. Baby
    Geetha, M.
    Balan, K.
    Mani, V.
    Sivaraju, S. S.
    [J]. INTERNATIONAL JOURNAL OF MODELING SIMULATION AND SCIENTIFIC COMPUTING, 2024,
  • [37] Re-Stream: Real-time and energy-efficient resource scheduling in big data stream computing environments
    Sun, Dawei
    Zhang, Guangyan
    Yang, Songlin
    Meng, Weimin
    Khan, Samee U.
    Li, Keqin
    [J]. INFORMATION SCIENCES, 2015, 319 : 92 - 112
  • [38] H2O-Cloud: A Resource and Quality of Service-Aware Task Scheduling Framework for Warehouse-Scale Data Centers
    Cheng, Mingxi
    Li, Ji
    Bogdan, Paul
    Nazarian, Shahin
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (10) : 2925 - 2937
  • [39] Recent implications towards sustainable and energy efficient AI and big data implementations in cloud-fog systems: A newsworthy inquiry
    Ikhlasse, Hamzaoui
    Benjamin, Duthil
    Vincent, Courboulay
    Hicham, Medromi
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (10) : 8867 - 8887
  • [40] THE ZIG-ZAG PROCESS AND SUPER-EFFICIENT SAMPLING FOR BAYESIAN ANALYSIS OF BIG DATA
    Bierkens, Joris
    Fearnhead, Paul
    Roberts, Gareth
    [J]. ANNALS OF STATISTICS, 2019, 47 (03) : 1288 - 1320