IPSO: A Scaling Model for Data-Intensive Applications

Cited by: 1
Authors
Li, Zhongwei [1]
Duan, Feng [1]
Minh Nguyen [1]
Che, Hao [1]
Lei, Yu [1]
Jiang, Hong [1]
Affiliations
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Keywords
scale-out workload; cloud computing; speedup; performance evaluation; Amdahl's Law; Gustafson's Law
DOI
10.1109/ICDCS.2019.00032
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Today's data center applications are predominantly data-intensive, calling for scaling out the workload to a large number of servers for parallel processing. Unfortunately, the existing scaling laws, notably Amdahl's and Gustafson's laws, are inadequate to characterize the scaling properties of data-intensive workloads. To fill this void, this paper puts forward a new scaling model, called the In-Proportion and Scale-Out-induced scaling model (IPSO). IPSO generalizes the existing scaling models in two important aspects. First, it accounts for possible in-proportion scaling, i.e., the scaling of the serial portion of the workload in proportion to the scaling of the parallelizable portion of the workload. Second, it takes into account possible scale-out-induced scaling, i.e., the scaling of the collective overhead or workload induced by scaling out. IPSO exposes scaling properties of data-intensive workloads, rendering the existing scaling laws its special cases. In particular, IPSO reveals two new pathological scaling properties: the speedup may level off even in the case of the fixed-time workload underlying Gustafson's law, and it may peak and then fall as the system scales out. Extensive MapReduce and Spark-based case studies demonstrate that IPSO successfully captures diverse scaling properties of data-intensive applications. As a result, it can serve as a diagnostic tool to gain insight into, or even uncover counter-intuitive root causes of, observed scaling behaviors, especially pathological ones, for data-intensive applications. Finally, preliminary results also demonstrate the promise of IPSO for facilitating effective resource provisioning to achieve the best speedup-versus-cost tradeoffs for data-intensive applications.
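The two classical laws named in the abstract, and the "peak and then fall" behavior it mentions, can be illustrated with a minimal sketch. The first two functions are the standard textbook forms of Amdahl's and Gustafson's laws. The third is only a hypothetical toy model, assuming an overhead that grows linearly with the number of servers; it is not the IPSO formula from the paper, and the names `speedup_with_overhead` and `overhead_per_node` are invented for illustration.

```python
def amdahl_speedup(serial_fraction, n):
    """Amdahl's law: fixed-size workload on n servers."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def gustafson_speedup(serial_fraction, n):
    """Gustafson's law: fixed-time workload, parallel part grows with n."""
    return serial_fraction + (1.0 - serial_fraction) * n


def speedup_with_overhead(serial_fraction, n, overhead_per_node):
    """Hypothetical toy model (NOT the paper's IPSO formula).

    Amdahl's law plus an assumed overhead term that grows linearly
    with the number of additional servers.
    """
    time = (serial_fraction
            + (1.0 - serial_fraction) / n
            + overhead_per_node * (n - 1))
    return 1.0 / time


if __name__ == "__main__":
    for n in (1, 10, 30, 100, 1000):
        print(n,
              round(amdahl_speedup(0.1, n), 2),
              round(gustafson_speedup(0.1, n), 2),
              round(speedup_with_overhead(0.1, n, 0.001), 2))
```

Under these assumed parameters the toy speedup rises, peaks near 30 servers, and then drops below 1, which is the kind of shape that neither classical law can produce on its own.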
Pages: 238-248
Number of pages: 11