Reliability Assurance for AI Systems

Cited by: 4
Authors
Blood, Jonathan C. [1 ]
Herbert, Nathan W. [1 ]
Wayne, Martin R. [1 ]
Affiliations
[1] US Army Combat Capabilities Development Command (DEVCOM) Analysis Center, Attn: FCDD-DAS-L, 6896 Mauchly St, Aberdeen Proving Ground, MD 21005 USA
Source
2023 Annual Reliability and Maintainability Symposium (RAMS) | 2023
Keywords
AI reliability
DOI
10.1109/RAMS51473.2023.10088197
CLC Classification Number
T [Industrial Technology]
Subject Classification Number
08
Abstract
Many applications of artificial intelligence (AI) / assistive automation are in the Army's pipeline of developmental technologies and systems. Ensuring reliability in these systems will require rethinking current approaches. A greater focus on systems' human and environmental interactions is needed. These interactions can range from traditional human factors issues, such as ease of use, to more complex situations involving AI decision making, feedback loops, and additional human cognitive loads. Other factors, such as unintended uses of AI, adversarial attacks, and the capacity of AI systems to evolve and adapt, must also be considered when examining reliability.

A review of failure modes for AI systems reveals common threads that deserve additional emphasis in reliability engineering activities:
  • Data pipelines
  • Human-system interactions
  • New adversarial attack modes

Traditional reliability engineering tools, such as Failure Modes and Effects Analysis, Reliability Block Diagrams, Highly Accelerated Life Testing, and Physics of Failure Analysis, remain very much relevant and deserve to be reemphasized [1]. They also need to be expanded and supplemented with additional tools to best meet the reliability challenges posed by AI systems. One such tool is System-Theoretic Process Analysis (STPA), which can be introduced early in concept development and approaches hazard/risk analysis holistically, including by examining human-system interactions, training, and documentation [2].

One of the main benefits of AI systems is their ability to be flexible and adaptable to new conditions. However, changes in the environment, the data pipeline, and user behavior can lead to unreliability if the way the system adapts is not skillfully managed. Regular, proactive assessments of system health after deployment, similar to preventative maintenance checks and services (PMCS), that consider these failure sources can help ensure reliability throughout the system's useful life.
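As a minimal sketch of one such PMCS-style post-deployment check, the Python snippet below flags drift in a single input-feature distribution with a two-sample Kolmogorov-Smirnov test. This is an illustrative assumption, not the authors' method: the feature data, the SciPy-based detector, and the alert threshold are all hypothetical choices for demonstration.

# Minimal sketch of a PMCS-style post-deployment health check for an AI system.
# Assumptions (not from the paper): features arrive as NumPy arrays, a reference
# sample from development/test time is available, and a two-sample KS test is an
# acceptable drift detector. The threshold below is illustrative only.
import numpy as np
from scipy import stats

DRIFT_P_VALUE = 0.01  # hypothetical alert threshold

def check_feature_drift(reference: np.ndarray, live: np.ndarray) -> dict:
    """Compare a live feature sample against its development-time reference."""
    statistic, p_value = stats.ks_2samp(reference, live)
    return {
        "ks_statistic": float(statistic),
        "p_value": float(p_value),
        "drift_detected": bool(p_value < DRIFT_P_VALUE),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(loc=0.0, scale=1.0, size=5_000)  # development-time data
    live = rng.normal(loc=0.4, scale=1.0, size=5_000)        # shifted field data
    print(check_feature_drift(reference, live))  # drift_detected should be True here

In practice such a check would run on a schedule against each monitored feature and model output, with detections feeding the kind of proactive health assessment the abstract describes.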
Pages: 6
References
16 items in total
  • [1] [Anonymous], 2016, IEEE Std 1633-2016
  • [2] Bernreuther D, 2013, 59TH ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM (RAMS)
  • [3] Brady SP, 2019, P REL MAINT S
  • [4] Dhinakaran A., 2021, MODELS SHIPPED WHAT
  • [5] Dominguez G.A., Kawaai K., Maruyama H., 2021, FAILS: a tool for assessing risk in ML systems, 2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021), pp. 1-4
  • [6] Feng A., 2019, MYTH IMPARTIAL MACHI
  • [7] Kumar R., 2019, Failure Modes in Machine Learning Systems
  • [8] Leveson N. G., 2018, STPA Handbook
  • [9] Liu J., 2021, ROBUST INTELLIGENCE
  • [10] Pohland T., 2014, Defense AT&L Magazine, Jan.