A Fault-tolerance Framework for Distributed Component Systems

Cited by: 1
Authors
Hamid, Brahim [1]
Radermacher, Ansgar [1]
Vanuxeem, Patrick [1]
Lanusse, Agnes [1]
Gerard, Sebastien [1]
Affiliations
[1] CEA, LIST, Lab Ingn Dirigee Modeles Syst Embarques, F-91191 Gif Sur Yvette, France
Source
PROCEEDINGS OF THE 34TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS | 2008
Keywords
Connector CORBA Component Model; Distributed applications; Failure detection; Fault tolerance; Middleware; Model-driven;
DOI
10.1109/SEAA.2008.50
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The demand for higher system reliability and availability is continuously increasing, even in domains not traditionally concerned with such issues. Solutions are expected to be efficient, flexible, reusable on rapidly evolving hardware, and, of course, low in cost. Combining model-driven and component-based approaches appears to be a very promising way of building solutions to this problem. Hence, in this paper we present an approach that uses the model as a first-class structural citizen throughout the development process. Our proposal is illustrated with an application modeled in UML (extended with some of its dedicated profiles). Our approach includes an underlying execution infrastructure/middleware that provides fault-tolerance services. On the component side, our framework promotes an infrastructure based on the Component/Container/Connector paradigm that provides run-time facilities enabling transparent management of fault tolerance (mainly fault-detection and redundancy mechanisms). On the model-driven side, our framework provides tool support to assist users in modeling their applications and in deploying and configuring them on computing platforms. In this paper we focus on the run-time support offered by the component framework, especially the replication-aware interaction mechanism that enables transparent replication management, and on some additional system components dedicated to fault detection and replica management.
Pages: 84-91
Page count: 8