40 entries in total
[1]
[Anonymous], INT C DEP SYST NETW
[2]
[Anonymous], 2012, PROC IEEE INT C HIGH
[3]
Bairavasundaram L.N., 2008, Characteristics, Impact, and Tolerance of Partial Disk Failures
[4]
Reducing Waste in Extreme Scale Systems through Introspective Analysis [C]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2016), 2016: 212-221
[6]
A higher order estimate of the optimum checkpoint interval for restart dumps [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS, 2006, 22(03): 303-312
[7]
Di Martino, 2014, 44 INT C DEP SYST NE
[8]
Measuring and Understanding Extreme-Scale Application Resilience: A Field Study of 5,000,000 HPC Application Runs [C]. 2015 45TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, 2015: 25-36
[9]
LOGAIDER: A tool for mining potential correlations of HPC log events [C]. 2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2017: 442-451
[10]
El-Sayed Nosayba, 2013, READING LINES FAILUR