Prototype of fault adaptive embedded software for large-scale real-time systems

Cited by: 2
Authors
Messie, Derek [1]
Jung, Mina [1]
Oh, Jae C. [1]
Shetty, Shweta [2]
Nordstrom, Steven [2]
Haney, Michael [3]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
[2] Vanderbilt Univ, Inst Software Integrated Syst, Nashville, TN 37235 USA
[3] Univ Illinois, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
Keywords
large-scale real-time systems; embedded systems; subsumption architecture; multi-agent systems
DOI
10.1007/s10462-007-9028-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight self-optimizing agents embedded within Level 1 of the prototype are responsible for proactive and reactive monitoring and mitigation based on specified layers of competence. The agents are self-protecting, detecting cascading failures using a distributed approach. Adaptive, reconfigurable, and mobile objects for reliability are designed to be self-configuring to adapt automatically to dynamically changing environments. These objects provide a self-healing layer with the ability to discover, diagnose, and react to discontinuities in real-time processing. A generic modeling environment was developed to facilitate design and implementation of hardware resource specifications, application data flow, and failure mitigation strategies. Level 1 of the planned BTeV trigger system alone will consist of 2500 DSPs, so the number of components and intractable fault scenarios involved make it impossible to design an 'expert system' that applies traditional centralized mitigative strategies based on rules capturing every possible system state. Instead, a distributed reactive approach is implemented using the tools and methodologies developed by the Real-Time Embedded Systems group.
Pages: 299-312
Page count: 14