Successive Refinement in Large-Scale Computation: Expediting Model Inference Applications

Cited by: 0
Authors
Esfahanizadeh, Homa [1]
Cohen, Alejandro [2]
Shamai, Shlomo [2]
Medard, Muriel [3]
Affiliations
[1] Nokia Bell Labs, Murray Hill, NJ 07974 USA
[2] Technion Israel Inst Technol, Elect & Comp Engn Dept, IL-3200003 Haifa, Israel
[3] MIT, Res Lab Elect RLE, Cambridge, MA 02139 USA
Funding
US National Science Foundation; Israel Science Foundation;
Keywords
Computation; layered resolution; adaptive; linear; nonlinear; machine learning; inference; matrix multiplication;
DOI
10.1109/TSP.2025.3537409
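Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract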
Modern computationally-intensive applications often operate under time constraints, necessitating acceleration methods and distribution of computational workloads across multiple entities. However, the outcome is either achieved within the desired timeline or not, and in the latter case, valuable resources are wasted. In this paper, we introduce solutions for layered-resolution computation. These solutions allow lower-resolution results to be obtained at an earlier stage than the final result. This innovation notably enhances deadline-based systems: if a computational job is terminated due to time constraints, an approximate version of the final result can still be generated. Moreover, in certain operational regimes, a high-resolution result might be unnecessary, because the low-resolution result may already deviate significantly from the decision threshold, for example in AI-based decision-making systems. Therefore, operators can decide whether higher resolution is needed based on intermediate results, enabling computations with adaptive resolution. We present our framework for two critical and computationally demanding jobs: distributed matrix multiplication (linear) and model inference in machine learning (nonlinear). Our theoretical and empirical results demonstrate that the execution delay for the first resolution is significantly shorter than that for the final resolution, while maintaining overall complexity comparable to the conventional one-shot approach. Our experiments further illustrate how the layering feature increases the likelihood of meeting deadlines and enables adaptability and transparency in massive, large-scale computations.
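The layered-resolution idea for the linear case can be illustrated with a minimal sketch. This is not the authors' scheme: it assumes one simple way to form layers, namely splitting a non-negative integer matrix into its most significant bits (coarse layer) and its remaining bits (fine layer). The names `split_layers` and `layered_matmul`, and the 8-bit/4-bit widths, are hypothetical choices made for illustration only.

```python
import numpy as np


def split_layers(A, bits_total=8, bits_coarse=4):
    """Split a non-negative integer matrix so that A == coarse + fine.

    Assumption for this sketch: entries of A fit in `bits_total` bits.
    `coarse` keeps the top `bits_coarse` bits; `fine` keeps the rest.
    """
    shift = bits_total - bits_coarse
    coarse = (A >> shift) << shift  # zero out the low-order bits
    fine = A - coarse               # the low-order bits that were dropped
    return coarse, fine


def layered_matmul(A, B, bits_total=8, bits_coarse=4):
    """Yield successive refinements of A @ B, coarsest first.

    The first yielded value is a low-resolution product; the second
    adds the contribution of the fine layer and equals A @ B exactly.
    """
    coarse, fine = split_layers(A, bits_total, bits_coarse)
    low_res = coarse @ B
    yield low_res                   # usable if a deadline hits here
    yield low_res + fine @ B        # full-resolution result


rng = np.random.default_rng(0)
A = rng.integers(0, 256, size=(4, 6), dtype=np.int64)
B = rng.integers(0, 10, size=(6, 3), dtype=np.int64)

resolutions = list(layered_matmul(A, B))
```

A caller that only needs to compare the result against a decision threshold could inspect `resolutions[0]` and skip the second layer when the coarse value is already far from the threshold, which is the adaptive-resolution behavior the abstract describes.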
Pages: 811-826
Number of pages: 16