IPSO: A Scaling Model for Data-Intensive Applications

Cited by: 1
Authors
Li, Zhongwei [1]
Duan, Feng [1]
Minh Nguyen [1]
Che, Hao [1]
Lei, Yu [1]
Jiang, Hong [1]
Affiliation
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Keywords
scale-out workload; cloud computing; speedup; performance evaluation; Amdahl's Law; Gustafson's Law
DOI
10.1109/ICDCS.2019.00032
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Today's data center applications are predominantly data-intensive, calling for scaling out the workload to a large number of servers for parallel processing. Unfortunately, the existing scaling laws, notably Amdahl's and Gustafson's laws, are inadequate to characterize the scaling properties of data-intensive workloads. To fill this void, in this paper, we put forward a new scaling model, called the In-Proportion and Scale-Out-induced scaling model (IPSO). IPSO generalizes the existing scaling models in two important aspects. First, it accounts for possible in-proportion scaling, i.e., the scaling of the serial portion of the workload in proportion to the scaling of the parallelizable portion of the workload. Second, it takes into account possible scale-out-induced scaling, i.e., the scaling of the collective overhead or workload induced by scaling out. IPSO exposes scaling properties of data-intensive workloads, rendering the existing scaling laws its special cases. In particular, IPSO reveals two new pathological scaling properties: the speedup may level off even in the case of the fixed-time workload underlying Gustafson's law, and it may peak and then fall as the system scales out. Extensive MapReduce and Spark-based case studies demonstrate that IPSO successfully captures diverse scaling properties of data-intensive applications. As a result, it can serve as a diagnostic tool to gain insights into, or even uncover counter-intuitive root causes of, observed scaling behaviors, especially pathological ones, for data-intensive applications. Finally, preliminary results also demonstrate the promising prospects of IPSO to facilitate effective resource provisioning to achieve the best speedup-versus-cost tradeoffs for data-intensive applications.
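The two classical laws named in the abstract, and the two pathological behaviors it describes, can be illustrated with a small Python sketch. The functions `amdahl_speedup` and `gustafson_speedup` are the standard textbook formulas. The function `toy_speedup` and its parameters `serial_growth`, `overhead_coeff`, and `overhead_exp` are hypothetical and are not taken from the paper: this record does not give the IPSO equations, so the toy model is only one assumed way to add a growing serial portion and a scale-out-induced overhead to a fixed-time workload.

```python
def amdahl_speedup(n, serial_fraction):
    """Amdahl's law: speedup of a fixed-size workload on n nodes."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def gustafson_speedup(n, serial_fraction):
    """Gustafson's law: speedup of a fixed-time workload on n nodes."""
    return serial_fraction + (1.0 - serial_fraction) * n


def toy_speedup(n, serial_fraction, serial_growth=0.0,
                overhead_coeff=0.0, overhead_exp=1.0):
    """Illustrative toy model (an assumption, NOT the paper's IPSO formula).

    The parallel work grows with n, as in Gustafson's setting.
    serial_growth in [0, 1] lets the serial work grow with n as well
    (0 = constant serial work, 1 = fully in proportion).
    overhead_coeff * (n - 1) ** overhead_exp is an assumed extra cost
    that exists only because the job was scaled out.
    """
    s = serial_fraction
    p = 1.0 - s
    serial_work = s * (1.0 + serial_growth * (n - 1))
    parallel_work = p * n
    overhead = overhead_coeff * (n - 1) ** overhead_exp
    time_one_node = serial_work + parallel_work
    time_n_nodes = serial_work + parallel_work / n + overhead
    return time_one_node / time_n_nodes


if __name__ == "__main__":
    for n in (1, 8, 32, 128):
        print(
            n,
            round(gustafson_speedup(n, 0.1), 2),
            # serial part grows in proportion: speedup levels off
            round(toy_speedup(n, 0.1, serial_growth=1.0), 2),
            # quadratic scale-out overhead: speedup peaks, then falls
            round(toy_speedup(n, 0.1, overhead_coeff=0.001, overhead_exp=2.0), 2),
        )
```

Under these assumptions, setting both extra parameters to zero reduces the toy model to Gustafson's law, while `serial_growth=1.0` makes the fixed-time speedup follow the Amdahl-style curve and saturate. A superlinear overhead term makes the speedup peak and then decline, which mirrors the qualitative behaviors the abstract attributes to IPSO without reproducing its actual equations.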
Pages: 238-248
Page count: 11