55 entries in total
[33]
Exploiting Asynchrony from Exact Forward Recovery for DUE in Iterative Solvers. Proceedings of SC15: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2015.
[35]
Lee J, 2009, DES AUT CON, P47
[37]
Relaxed Replication for Energy Efficient and Resilient GPU Computing. Proceedings of Workshop on Fault Tolerance for HPC at Extreme Scale (FTXS 2021), 2021: 41-50.
[38]
Energy Analysis and Optimization for Resilient Scalable Linear Systems. 2018 IEEE International Conference on Cluster Computing (CLUSTER), 2018: 24-34.
[39]
Mills Bryan, 2013, Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, P6
[40]
VeloC: Towards High Performance Adaptive Asynchronous Checkpointing at Large Scale. 2019 IEEE 33rd International Parallel and Distributed Processing Symposium (IPDPS 2019), 2019: 911-920.