Whirlwind: Overload protection, fault-tolerance and self-tuning in an Internet services platform

Cited by: 0
Authors
Donald, Peter [1]
Singh, Samar [1]
Ghosh, Somnath [1]
Affiliations
[1] La Trobe Univ, Sch Comp Sci & Elect Engn, Melbourne, Vic, Australia
Source
2009 IEEE 9TH MALAYSIA INTERNATIONAL CONFERENCE ON COMMUNICATIONS (MICC) | 2009
Keywords
concurrency; software architecture; overload; fault-tolerance; self-tuning; MANAGEMENT
DOI
10.1109/MICC.2009.5431539
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Performance and availability are of critical importance when Internet services are integrated into emergency response management. Poor performance or service failure can result in severe economic, social or environmental cost. This paper presents Whirlwind, a software architecture that includes primitives for overload management and fault tolerance. A Whirlwind service is composed of a collection of isolated, independent, sequential processes that communicate through asynchronous message passing. If a process fails, the fault is contained within the process and a message is propagated to monitoring processes that may attempt to recover from the error. Processes are grouped with other processes that share similar resource, computation and concurrency requirements. Each group contains a scheduler and a thread pool that drives execution of processes within the group. The group may also define a message predicate that determines if a message posted to a process in the group is accepted. A rejected message typically signals overload and allows the application the chance to perform load shedding and avoid overcommitment of resources. Principals are shared between processes in different groups, enabling consistent prioritization and admission control across groups. The resource management policies are typically driven by feedback loops that monitor resource availability and system performance, and adjust tuning parameters to meet performance goals. Whirlwind evolved over a period of five fire seasons as part of emergency response software in Victoria, Australia.
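The mechanisms named in the abstract (isolated processes, a per-group message predicate, failure notification to monitors, and feedback-driven tuning) can be illustrated with a minimal Python sketch. This is not Whirlwind's API: the names `Process`, `Group`, `post`, `run_pending` and `adjust_limit` are hypothetical, the thread pool is replaced by a single-threaded drain loop, monitors are assumed to live in the same group as the process they watch, and the halve-or-increment tuning rule is an assumption chosen only for brevity.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Tuple


class Process:
    """Hypothetical isolated sequential process: a name plus a message handler."""

    def __init__(self, name: str, handler: Callable[[Any], None]) -> None:
        self.name = name
        self.handler = handler
        self.monitors: List["Process"] = []  # notified if this process fails
        self.failed = False


class Group:
    """Hypothetical group: one shared mailbox and one message predicate."""

    def __init__(self, accept: Callable[["Group", Any], bool]) -> None:
        self.accept = accept
        self.mailbox: Deque[Tuple[Process, Any]] = deque()

    def post(self, target: Process, message: Any) -> bool:
        """Return False when the predicate rejects the message (overload signal)."""
        if not self.accept(self, message):
            return False
        self.mailbox.append((target, message))
        return True

    def run_pending(self) -> None:
        """Stand-in for the scheduler and thread pool: drain on the calling thread."""
        while self.mailbox:
            target, message = self.mailbox.popleft()
            if target.failed:
                continue  # messages to a failed process are dropped
            try:
                target.handler(message)
            except Exception as exc:
                # Contain the fault in the process and tell its monitors.
                target.failed = True
                for monitor in target.monitors:
                    self.mailbox.append((monitor, ("failed", target.name, repr(exc))))


def adjust_limit(limit: int, observed_ms: float, target_ms: float) -> int:
    """Assumed feedback step: halve the limit when latency is over target, else add 1."""
    if observed_ms > target_ms:
        return max(1, limit // 2)
    return limit + 1


# Usage: a queue-depth predicate sheds the third message; a failure reaches the monitor.
limit = {"max_pending": 2}
group = Group(accept=lambda g, m: len(g.mailbox) < limit["max_pending"])
log: List[Any] = []


def work(message: Any) -> None:
    if message == "boom":
        raise ValueError("bad input")
    log.append(("done", message))


monitor = Process("monitor", log.append)
worker = Process("worker", work)
worker.monitors.append(monitor)

results = [group.post(worker, m) for m in ("a", "boom", "c")]
group.run_pending()
limit["max_pending"] = adjust_limit(limit["max_pending"], observed_ms=120.0, target_ms=100.0)
```

Here a `False` return from `post` plays the role of the rejected message: the caller learns of overload immediately and can shed load instead of queueing more work.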
Pages: 397-402
Page count: 6