16 entries in total
[1] Online Algorithm-Based Fault Tolerance for Cholesky Decomposition on Heterogeneous Systems with GPUs [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2016), 2016: 993-1002
[2] Chen Z., 2013, PPOPP
[4] Guo L., 2016, ACMIEEE INT C HIGH P
[6] MATCH: An MPI Fault Tolerance Benchmark Suite [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC 2020), 2020: 60-71
[7] MOARD: Modeling Application Resilience to Transient Faults on Data Objects [J]. 2019 IEEE 33RD INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2019), 2019: 878-889
[8] Guo LZ, 2018, PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE, AND ANALYSIS (SC'18), DOI 10.1109/SC.2018.00011
[9] Rethinking Algorithm-Based Fault Tolerance with a Cooperative Software-Hardware Approach [J]. 2013 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2013,