GridX1: A Canadian computational grid

Cited by: 16
Authors
Agarwal, A.
Ahmed, M.
Berman, A.
Caron, B. L.
Charbonneau, A.
Deatrich, D.
Desmarais, R.
Dimopoulos, A.
Gable, I.
Groer, L. S.
Haria, R.
Impey, R.
Klektau, L.
Lindsay, C.
Mateescu, G.
Matthews, Q.
Norton, A.
Podaima, W.
Quesnel, D.
Simmonds, R.
Sobie, R. J. [1]
St. Arnaud, B.
Usher, C.
Vanderster, D. C.
Vetterli, M.
Walker, R.
Yuen, M.
Affiliations
[1] Univ Victoria, Dept Phys & Astron, Victoria, BC V8W 2Y2, Canada
[2] Univ Victoria, Dept Elect & Comp Engn, Victoria, BC V8W 2Y2, Canada
[3] Univ Victoria, Dept Phys & Astron, HEPnet, Victoria, BC V8W 2Y2, Canada
[4] CANARIE Inc, Ottawa, ON, Canada
[5] Natl Res Council Canada, Ottawa, ON K1A 0R6, Canada
[6] TRIUMF, Vancouver, BC V6T 2A3, Canada
[7] Univ Alberta, Dept Phys, Edmonton, AB T6G 2M7, Canada
[8] Univ Calgary, Dept Comp Sci, Calgary, AB T2N 1N4, Canada
[9] Simon Fraser Univ, Dept Phys, Burnaby, BC V5A 1S6, Canada
[10] Univ Toronto, Dept Phys, Toronto, ON M4X 1K9, Canada
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS | 2007 / Vol. 23 / Issue 05
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
grid computing; grid deployment; high energy physics applications;
DOI
10.1016/j.future.2006.12.006
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
This paper discusses the design and application of GridX1, a computational grid project that uses shared resources at several Canadian research institutions. The GridX1 infrastructure is built from off-the-shelf Globus Toolkit 2 middleware, a MyProxy credential server, and a resource broker based on Condor-G that manages the distributed computing environment. The broker's job scheduling and management functionality is exposed as a Globus GRAM job service. Resource brokering relies on the Condor matchmaking mechanism, in which job and resource attributes are expressed as ClassAds; the Requirements and Rank attributes define, respectively, the constraints that a matched entity must meet and the preferences used to order candidate matches. Various strategies for ranking resources are presented, including an Estimated-Waiting-Time (EWT) algorithm, a throttled load balancing strategy, and a novel external ranking strategy based on data location. A distinctive feature of GridX1 is a mechanism that transparently presents its resources as a single compute element to the LHC Computing Grid (LCG), based at the CERN laboratory in Geneva. This interface was used during ATLAS Data Challenge 2 to federate the Canadian resources into the LCG without the overhead of maintaining separate LCG sites. In addition, the BaBar particle physics simulation has been adapted to run on GridX1, which simplified management of the production. Combining the throttled EWT and load balancing strategies with external data ranking was found to be very effective in improving efficiency and reducing the job failure rate. (c) 2007 Elsevier B.V. All rights reserved.
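The Requirements/Rank matchmaking described in the abstract can be illustrated with a minimal Python sketch. It is not the GridX1 broker and does not use the Condor ClassAd language. The `Resource` and `Job` fields, the throttle limit `max_queued`, the `DATA_BONUS_S` weighting, and the site names are all hypothetical, chosen only to show how a constraint filter and a ranking based on waiting time and data location could combine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weighting: treat locally available data as worth one hour of waiting.
DATA_BONUS_S = 3600.0


@dataclass
class Resource:
    name: str
    cpus: int
    queued_jobs: int
    est_wait_s: float                  # assumed estimated waiting time, in seconds
    datasets: frozenset = frozenset()  # datasets assumed to be stored at this site
    max_queued: int = 10               # assumed throttle limit on queued jobs


@dataclass
class Job:
    name: str
    min_cpus: int
    dataset: Optional[str] = None


def requirements(job: Job, res: Resource) -> bool:
    """Constraints: the job's and the resource's conditions must both hold."""
    job_accepts = res.cpus >= job.min_cpus
    resource_accepts = res.queued_jobs < res.max_queued  # throttling
    return job_accepts and resource_accepts


def rank(job: Job, res: Resource) -> float:
    """Preference: shorter estimated wait is better; local data adds a bonus."""
    has_data = job.dataset is not None and job.dataset in res.datasets
    return -res.est_wait_s + (DATA_BONUS_S if has_data else 0.0)


def match(job: Job, resources: list) -> Optional[Resource]:
    """Return the highest-ranked resource that satisfies the requirements."""
    candidates = [r for r in resources if requirements(job, r)]
    return max(candidates, key=lambda r: rank(job, r), default=None)


# Hypothetical sites for illustration only.
SITES = [
    Resource("site-a", cpus=64, queued_jobs=2, est_wait_s=1800.0,
             datasets=frozenset({"sim-input-1"})),
    Resource("site-b", cpus=32, queued_jobs=0, est_wait_s=600.0),
    Resource("site-c", cpus=128, queued_jobs=10, est_wait_s=60.0,
             datasets=frozenset({"sim-input-1"})),  # at its throttle limit
]

if __name__ == "__main__":
    for job in (Job("with-data", 1, "sim-input-1"), Job("no-data", 1)):
        chosen = match(job, SITES)
        print(job.name, "->", chosen.name if chosen else None)
```

In this sketch the constraint check runs first and the ranking only orders the surviving candidates, which mirrors the split between Requirements and Rank. A single numeric rank is used so that waiting time and data location trade off against each other; a real broker may weight them differently.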
Pages: 680-687
Number of pages: 8