Supporting Shared Resource Usage for a Diverse User Community: the OSG Experience and Lessons Learned

Cited by: 3
Authors
Garzoglio, Gabriele [1]
Levshina, Tanya [1]
Rynge, Mats
Sehgal, Chander [1]
Slyz, Marko [1]
Affiliations
[1] Fermilab Natl Accelerator Lab, Comp Sect, POB 500, Batavia, IL 60510 USA
Source
INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS 2012 (CHEP2012), PTS 1-6 | 2012 / Vol. 396
DOI
10.1088/1742-6596/396/3/032046
Chinese Library Classification (CLC) number
O4 [Physics];
Discipline classification code
0702;
Abstract
The Open Science Grid (OSG) supports a diverse community of new and existing users in adopting and making effective use of the Distributed High Throughput Computing (DHTC) model. The LHC user community has deep local support within the experiments. For other, smaller communities and individual users, the OSG provides consulting and technical services through the User Support area. We describe these experiences, some successful and some less so, and analyze lessons learned that are helping us improve our services. The services offered include forums to enable shared learning and mutual support, tutorials and documentation for new technology, and troubleshooting of problematic or systemic failure modes. For new communities and users, we bootstrap their use of the distributed high throughput computing technologies and resources available on the OSG by following a phased approach. We first adapt the application and run a small production campaign on a subset of "friendly" sites. Only then do we move the user to run full production campaigns across the many remote sites on the OSG, adding to the community resources up to hundreds of thousands of CPU hours per day. This scaling up generates new challenges, such as non-deterministic job completion times and diverse errors due to the heterogeneity of the configurations and environments, so some attention is needed to get good results. We cover recent experiences with image simulation for the Large Synoptic Survey Telescope (LSST), small-file, large-volume data movement for the Dark Energy Survey (DES), civil engineering simulation with the Network for Earthquake Engineering Simulation (NEES), and accelerator modeling with the Electron Ion Collider group at BNL. We categorize and analyze the use cases and describe how our processes are evolving based on lessons learned.
Pages: 11