Algorithms for reliability-oriented module allocation in distributed computing systems

被引:28
作者
Tom, PA
Murthy, CSR [1 ]
机构
[1] Indian Inst Technol, Dept Comp Sci & Engn, Madras 600036, Tamil Nadu, India
[2] Motorola India Elect Ltd, Bangalore 560042, Karnataka, India
关键词
D O I
10.1016/S0164-1212(97)00005-8
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We consider the problem of finding an allocation of program modules onto processors of a distributed computing system such that the reliability of successfully executing these modules is maximized. The distributed system consists of a number of processors interconnected by means of communication links. Certain constraints such as storage and load limits may be present at each processor. At any point of time, each component of the distributed system (processor or communication link) can exist in either of two states-operational or failed. The probability of a component being operational is given. A program module can be executed on any one of a set of processors. For execution, it requires access to certain data files. If a particular file it requires is not available locally, it has to access the file remotely, and for the remote access to be possible, at least one path (sequence of links and processors) from the processor at which the program module is executing, to one of the processors where the required file is available, must be operational. To improve reliability, there may be multiple copies of certain files, dispersed at various processors. Our aim is to allocate the program modules to processors in a manner that maximizes the probability of it being able to successfully access ail the files it requires for execution, and the allocation should not violate any of the constraints. This problem is known to be NP-hard. We use a state space search technique-the A* algorithm to obtain an optimal allocation. We also present a heuristic algorithm which obtains sub-optimal allocations in a reasonable amount of computation time. Through simulations over a wide range of parameters, we demonstrate the effectiveness of our approach. (C) 1998 Elsevier Science Inc.
引用
收藏
页码:125 / 138
页数:14
相关论文
共 17 条
[1]  
AJITH TP, 1996, THESIS INDIAN I TECH
[2]   TASK ALLOCATION IN FAULT-TOLERANT DISTRIBUTED SYSTEMS [J].
BANNISTER, JA ;
TRIVEDI, KS .
ACTA INFORMATICA, 1983, 20 (03) :261-281
[3]   DUAL PROCESSOR SCHEDULING WITH DYNAMIC REASSIGNMENT [J].
BOKHARI, SH .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1979, 5 (04) :341-349
[4]   AN LC BRANCH-AND-BOUND ALGORITHM FOR THE MODULE ASSIGNMENT PROBLEM [J].
CHERN, MS ;
CHEN, GH ;
LIU, P .
INFORMATION PROCESSING LETTERS, 1989, 32 (02) :61-71
[5]  
EFE K, 1982, COMPUTER, V15, P50, DOI 10.1109/MC.1982.1654050
[6]  
HARIRI S, 1987, IEEE T COMPUT, V36, P1224, DOI 10.1109/TC.1987.1676862
[7]  
HARIRI S, 1986, P IEEE ACM 1986 FALL, P344
[8]   HEURISTIC ALGORITHMS FOR TASK ASSIGNMENT IN DISTRIBUTED SYSTEMS [J].
LO, VM .
IEEE TRANSACTIONS ON COMPUTERS, 1988, 37 (11) :1384-1397
[9]  
NILSSON NJ, 1977, PROBLEM SOLVING METH
[10]   OPTIMAL SCHEDULING OF COOPERATIVE TASKS IN A DISTRIBUTED SYSTEM USING AN ENUMERATIVE METHOD [J].
PENG, DT ;
SHIN, KG .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1993, 19 (03) :253-267