Fault tolerance deadlock detection/resolution algorithm for the AND-OR model

被引:0
作者
Cheng, Xin [1 ,2 ]
Liu, Hongwei [1 ]
Dong, Jian [1 ]
Yang, Xiaozong [1 ]
机构
[1] School of Computer Science and Technology, Harbin Institute of Technology
[2] Shanghai Fire-Fighting Research Institute, Ministry of Public Security
来源
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2007年 / 44卷 / 05期
关键词
AND-OR model; Deadlock detection; Distributed system; Fault tolerance; Generalized model;
D O I
10.1360/crad20070510
中图分类号
学科分类号
摘要
Distributed system provides an effective method to build high performance computing system with relative low cost. During the running of distributed computing, deadlock would happen if the resource allocation and requirement confliction occur. All the processes are waiting each other for the holding resources releasing, and then the system running stops. In an unreliable distributed system, failures may prevent deadlock detection algorithms from properly detecting deadlocks. Few of the algorithms proposed in the literatures address the issue of handing these failures. In this paper, three types of failures are identified and a fault tolerance deadlock detection and resolution algorithm is proposed. Failures are treated as different detection termination conditions in the algorithm. The algorithm is based on a generalized AND-OR model and employs diffusion computation and centralized reduction methods to detect deadlocks. The algorithm distinguishes cycles and knots and gives all members of a cycle. The upper limit of the time and space complexity is d and e+n-1 respectively if the deadlock topology is a static cycle, where e is the number of the edge and n and d are the number of the nodes in the wait-for graph. The performance of the proposed algorithm is equal to or better than that of the current algorithms.
引用
收藏
页码:798 / 805
页数:7
相关论文
共 20 条
[1]  
Yang F., Liu X., Zuo C., Research on transparency and task scheduling of a distributed parallel server, Journal of Computer Research and Development, 40, 9, pp. 1319-1325, (2003)
[2]  
Knapp E., Deadlock detection in distributed databases, ACM Computing Surveys, 19, 4, pp. 303-328, (1987)
[3]  
Chandy K.M., Misra J., A distributed algorithm for detecting resource deadlocks in distributed systems, ACM SIGACTSIGOPS Symp Principles of Distributed Computing, (1982)
[4]  
Mitchell D.P., Merritt M.J., A distributed algorithm for deadlock detection and resolution, ACM Conf on Principles of Distributed Computing, (1984)
[5]  
Gonzalez J.R., A distributed deadlock resolution algorithm for the AND model, IEEE Trans on Parallel and Distributed Systems, 10, 5, pp. 433-447, (1999)
[6]  
Choudhary A.N., Kohler W.H., Stankovic J.A., A modified priority based probe algorithm for distributed deadlock detection and resolution, IEEE Trans on Software Engineering, 15, 1, pp. 10-17, (1989)
[7]  
Lee S., A fast algorithm for detecting distributed deadlocks in the OR request model, The 15th Int'l Parallel and Distributed Processing Symposium, (2001)
[8]  
Lee S., Performance analysis of distributed deadlock detection algorithms, IEEE Trans on Knowledge and Data Engineering, 13, 4, pp. 623-636, (2003)
[9]  
Manivannan D., Singhal M., A distributed algorithm for knot detection in a distributed graph, Int'l Conf on Parallel Processing, (2002)
[10]  
Souza G.P., Pfitscher G.H., An implementation of a distributed algorithm for detection of local knots and cycles in directed graphs based on the CSP model and Java, The 6th IEEE Int'l Workshop on Distributed Simulation and Real-Time Applications, (2002)