Argobots: A Lightweight Low-Level Threading and Tasking Framework

Cited by: 73
Authors
Seo, Sangmin [1]
Amer, Abdelhalim [1]
Balaji, Pavan [1]
Bordage, Cyril [2]
Bosilca, George [4]
Brooks, Alex [3]
Carns, Philip [1]
Castello, Adrian [6]
Genet, Damien [4]
Herault, Thomas [4]
Iwasaki, Shintaro [5]
Jindal, Prateek [3]
Kale, Laxmikant V. [3]
Krishnamoorthy, Sriram [7]
Lifflander, Jonathan [8]
Lu, Huiwei [9]
Meneses, Esteban [10, 11]
Snir, Marc [3]
Sun, Yanhua [12]
Taura, Kenjiro [5]
Beckman, Pete [1]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Inria Bordeaux, F-33405 Talence, France
[3] Univ Illinois, Champaign, IL 61820 USA
[4] Univ Tennessee, Knoxville, TN 37996 USA
[5] Univ Tokyo, Bunkyo Ku, Tokyo 1138654, Japan
[6] Univ Jaume 1, Castellon De La Plana 12071, Castellon, Spain
[7] Pacific Northwest Natl Lab, Richland, WA 99354 USA
[8] Sandia Natl Labs, Livermore, CA 94551 USA
[9] Tencent, Shenzhen 518057, Peoples R China
[10] Costa Rica Natl High Technol Ctr, San Jose 10109, Costa Rica
[11] Costa Rica Inst Technol, Cartago 30101, Costa Rica
[12] Google, Mountain View, CA 94043 USA
Keywords
Argobots; user-level thread; tasklet; OpenMP; MPI; I/O; interoperability; lightweight; context switch; stackable scheduler
DOI
10.1109/TPDS.2017.2766062
Chinese Library Classification
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, either are too specific to applications or architectures or are not as powerful or flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by end users or high-level programming models. We describe the design, implementation, and performance characterization of Argobots and present integrations with three high-level models: OpenMP, MPI, and colocated I/O services. Evaluations show that (1) Argobots, while providing richer capabilities, is competitive with existing simpler generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) I/O services with Argobots reduce interference with colocated applications while achieving performance competitive with that of a Pthreads approach.
Pages: 512-526
Page count: 15