Challenges and Issues of the Integration of RADIC into Open MPI

Times cited: 0
Authors
Fialho, Leonardo [1]
Santos, Guna [1]
Duarte, Angelo [2]
Rexachs, Dolores [1]
Luque, Emilio [1]
Affiliations
[1] Univ Autonoma Barcelona, Comp Architecture & Operating Syst Dept, E-08193 Barcelona, Spain
[2] Univ Estadual Feira de Santana, Dept Tecnol, Feira De Santana, Brazil
Keywords
Fault Tolerance; High Availability; RADIC; Open MPI
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Parallel machines are growing in complexity and in number of components, which increases the probability of faults. As a result, MPI applications running on these machines may not reach completion. This paper presents RADIC/OMPI, the integration of the RADIC fault tolerance architecture into Open MPI. RADIC/OMPI relies on uncoordinated checkpoints combined with pessimistic receiver-based message logs, operating in a distributed way without the need for any central or stable elements. Because of this, it ensures that the application completes, automatically and transparently for users and administrators. We conclude that, for certain applications, RADIC/OMPI provides fault tolerance with an acceptable overhead even in the presence of consecutive faults.
Pages: 73 / +
Page count: 2
Related papers
50 in total
  • [31] Experimental settings in program comprehension: Challenges and open issues
    Di Lucca, Giuseppe A.
    Di Penta, Massimiliano
    14TH IEEE INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC 2006), PROCEEDINGS, 2006, : 229 - +
  • [32] Survey on Explainable AI: Techniques, challenges and open issues
    Abusitta, Adel
    Li, Miles Q.
    Fung, Benjamin C. M.
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 255
  • [33] Agile Project Management: Review, Challenges and Open Issues
    bin Ismail, Muhammad Fahmi
    Mansor, Zulkefli
    ADVANCED SCIENCE LETTERS, 2018, 24 (07) : 5216 - 5219
  • [34] Toward open-world software: Issues and challenges
    Baresi, Luciano
    Di Nitto, Elisabetta
    Ghezzi, Carlo
    COMPUTER, 2006, 39 (10) : 36 - +
  • [35] Materialized View Maintenance: Issues, Classification, and Open Challenges
    Sebaa, Abderrazak
    Tari, Abdelkamel
    INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS, 2019, 28 (01)
  • [36] Flying Social Networks: Architecture, Challenges and Open Issues
    Shi, Junling
    Zhao, Liang
    Wang, Xingwei
    Guizani, Mohsen
    Gačanin, Haris
    Lin, Na
    IEEE NETWORK, 2021, 35 (05) : 242 - 248
  • [37] Security in Nano Communication: Challenges and Open Research Issues
    Dressler, Falko
    Kargl, Frank
    2012 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2012, : 6183 - 6187
  • [38] On the Security and Privacy of Hyperledger Fabric: Challenges and Open Issues
    Brotsis, Sotirios
    Kolokotronis, Nicholas
    Limniotis, Konstantinos
    Bendiab, Gueltoum
    Shiaeles, Stavros
    2020 IEEE WORLD CONGRESS ON SERVICES (SERVICES), 2020, : 197 - 204
  • [39] Open MPI: A high-performance, heterogeneous MPI
    Graham, Richard L.
    Shipman, Galen M.
    Barrett, Brian W.
    Castain, Ralph H.
    Bosilca, George
    Lumsdaine, Andrew
    2006 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, VOLS 1 AND 2, 2006, : 621 - +
  • [40] A review on Key Issues and Challenges in Integration of Distributed Generation System
    Gupta, Nikita
    Seethalekshmi, K.
    2018 5TH IEEE UTTAR PRADESH SECTION INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND COMPUTER ENGINEERING (UPCON), 2018, : 335 - 341