Matchmaking: Distributed resource management for high throughput computing

Cited by: 133
Authors
Raman, R [1]
Livny, M [1]
Solomon, M [1]
Affiliations
[1] Univ Wisconsin, Madison, WI 53703 USA
DOI
10.1109/HPDC.1998.709966
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Conventional resource management systems use a system model to describe resources and a centralized scheduler to control their allocation. We argue that this paradigm does not adapt well to distributed systems, particularly those built to support high-throughput computing. Obstacles include heterogeneity of resources, which makes uniform allocation algorithms difficult to formulate, and distributed ownership, which leads to widely varying allocation policies. Faced with these problems, we developed and implemented the classified advertisement (classad) matchmaking framework, a flexible and general approach to resource management in distributed environments with decentralized ownership of resources. Novel aspects of the framework include a semi-structured data model that combines schema, data, and query in a simple but powerful specification language, and a clean separation of the matching and claiming phases of resource allocation. The representation and protocols result in a robust, scalable, and flexible framework that can evolve with changing resources. The framework was designed to solve real problems encountered in the deployment of Condor, a high-throughput computing system developed at the University of Wisconsin-Madison. Condor is heavily used by scientists at numerous sites around the world. It derives much of its robustness and efficiency from the matchmaking architecture.
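
The two ideas the abstract highlights, advertisements that carry their own constraints and a matching phase kept separate from the claiming phase, can be illustrated with a minimal Python sketch. This is not the classad language or Condor's implementation: ads are modeled here as plain dicts, the `requirements` and `rank` callables stand in for classad expressions, and every name (`is_match`, `matchmake`, `claim`, the machine and job attributes) is hypothetical.

```python
# Illustrative sketch only: dicts and lambdas stand in for classads.
# All function and attribute names below are assumptions, not Condor's API.

def is_match(a, b):
    """Two ads match only if each one's requirements accept the other."""
    return a["requirements"](a, b) and b["requirements"](b, a)


def matchmake(request, offers):
    """Matching phase: return the compatible offer the request ranks highest."""
    candidates = [o for o in offers if is_match(request, o)]
    if not candidates:
        return None
    return max(candidates, key=lambda o: request["rank"](request, o))


def claim(request, offer):
    """Claiming phase: the parties re-check their constraints against the
    offer's current state, so a stale match can still be refused."""
    return is_match(request, offer)


# Each machine ad carries its owner's policy alongside its attributes.
machines = [
    {"name": "m1", "arch": "INTEL", "memory_mb": 64, "load": 0.1,
     "requirements": lambda my, other: my["load"] < 0.3},
    {"name": "m2", "arch": "INTEL", "memory_mb": 128, "load": 0.2,
     "requirements": lambda my, other: my["load"] < 0.3
     and other["owner"] != "guest"},
    {"name": "m3", "arch": "SPARC", "memory_mb": 256, "load": 0.0,
     "requirements": lambda my, other: True},
]

# The job ad states what it needs and how it prefers to rank candidates.
job = {
    "owner": "alice",
    "requirements": lambda my, other: other["arch"] == "INTEL"
    and other["memory_mb"] >= 32,
    "rank": lambda my, other: other["memory_mb"],
}

best = matchmake(job, machines)   # matching phase
claimed = claim(job, best)        # claiming phase, done separately
print(best["name"], claimed)
```

Because `claim` re-evaluates both sets of requirements, a machine whose load rises after the match was made would refuse the claim in this sketch, which is the kind of robustness the separation of phases is meant to provide.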
Pages: 140 - 146
Page count: 3
Related papers
50 in total
  • [21] Distributed Resource Management for Licensed and Unlicensed Integrated Mobile Edge Computing
    Lu, Xiao
    Yin, Rui
    Chen, Chao
    Che, Xianfu
    Wu, Celimuge
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021
  • [22] Scalable resource management system for high productive computing
    Lu, Yutong
    Xiao, Nong
    Yang, Xuejun
    PROCEEDINGS OF THE THIRD CHINAGRID ANNUAL CONFERENCE, 2008, : 331 - 337
  • [23] AggieGrid: from idle PCs to a distributed High-Throughput Computing system
    Trecakov, Strahinja
    Von Wolff, Nicholas
    PRACTICE AND EXPERIENCE IN ADVANCED RESEARCH COMPUTING 2024, PEARC 2024, 2024
  • [24] Wrangling distributed computing for high-throughput environmental science: An introduction to HTCondor
    Erickson, Richard A.
    Fienen, Michael N.
    McCalla, S. Grace
    Weiser, Emily L.
    Bower, Melvin L.
    Knudson, Jonathan M.
    Thain, Greg
    PLOS COMPUTATIONAL BIOLOGY, 2018, 14 (10)
  • [25] Computing Resource Allocation for Heterogeneous Coded Distributed Computing
    Dai, Mingjun
    Yuan, Jialong
    Tong, Yanli
    Wang, Lan
    Lin, Xiaohui
    2022 31ST WIRELESS AND OPTICAL COMMUNICATIONS CONFERENCE (WOCC), 2022, : 18 - 23
  • [26] High Throughput Mutational Scanning of a Protein via Alchemistry on a High-Performance Computing Resource
    Guclu, Tandac F.
    Tayhan, Busra
    Cetin, Ebru
    Atilgan, Ali Rana
    Atilgan, Canan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2025, 37 (03)
  • [28] GridSim: a toolkit for the modeling and simulation of distributed resource management and scheduling for Grid computing
    Buyya, R
    Murshed, M
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2002, 14 (13-15): 1175 - 1220
  • [29] Dynamic power resource management for distributed computing on wirelessly connected handheld devices
    Moghal, MR
    Mian, MS
    Mirza, MS
    Mirza, MW
    International Conference on Computing, Communications and Control Technologies, Vol 5, Proceedings, 2004, : 340 - 345
  • [30] DMRM: Distributed Market-Based Resource Management of Edge Computing Systems
    Katsaragakis, Manolis
    Masouros, Dimosthenis
    Tsoutsouras, Vasileios
    Samie, Farzad
    Bauer, Lars
    Henkel, Joerg
    Soudris, Dimitrios
    2019 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2019, : 1391 - 1396