Shapley-based explainable AI for clustering applications in fault diagnosis and prognosis
Cited by: 8
Authors:
Cohen, Joseph [1]
Huan, Xun [2]
Ni, Jun [2]
Affiliations:
[1] Univ Michigan, Michigan Inst Data & AI Soc, 500 Church St, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Mech Engn, 2350 Hayward St, Ann Arbor, MI 48109 USA
Keywords:
Shapley value analysis;
Explainable artificial intelligence;
Clustering;
Prognostics and health management;
ARTIFICIAL-INTELLIGENCE;
DOI:
10.1007/s10845-024-02468-2
Chinese Library Classification:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104;
0812;
0835;
1405;
Abstract:
Data-driven artificial intelligence models require explainability in intelligent manufacturing to streamline adoption and build trust in modern industry. However, recently developed explainable artificial intelligence (XAI) techniques that estimate feature contributions in a model-agnostic way, such as SHapley Additive exPlanations (SHAP), have not yet been evaluated for semi-supervised fault diagnosis and prognosis problems characterized by class imbalance and weakly labeled datasets. This paper explores the potential of using Shapley values in a new clustering framework compatible with semi-supervised learning problems, loosening the strict supervision requirement of current XAI techniques. This broad methodology is validated on two case studies: a heatmap image dataset obtained from a semiconductor manufacturing process featuring class imbalance, and the benchmark N-CMAPSS dataset. Semi-supervised clustering based on Shapley values significantly improves clustering quality compared to the fully unsupervised case, deriving information-dense and meaningful clusters that relate to the underlying fault diagnosis model predictions. These clusters can also be characterized by high-precision decision rules in terms of original feature values, as demonstrated in the second case study. The rules, limited to two terms on the original feature scales, describe 14 of the 19 derived equipment failure clusters with average precision exceeding 0.85, showcasing the promising utility of the explainable clustering framework for intelligent manufacturing applications.
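The general idea in the abstract (cluster samples by their Shapley attributions rather than by raw features, then describe each cluster with a short rule on the original feature scale) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy `fault_score` model, the three unnamed features, the all-zero healthy baseline, the hand-written two-term rule, and the use of plain k-means. Shapley values are computed exactly by enumerating coalitions, with absent features replaced by their baseline value, rather than with the SHAP library.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Assumption: features outside a coalition take their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rule_precision(rule, samples, labels, cluster):
    """Fraction of samples matched by the rule that belong to the cluster."""
    covered = [lab for x, lab in zip(samples, labels) if rule(x)]
    return sum(lab == cluster for lab in covered) / len(covered) if covered else 0.0


# Hypothetical fault-score model: one additive term plus one interaction.
def fault_score(x):
    return 2.0 * x[0] + x[1] * x[2]


# Toy samples: the first three are driven by feature 0, the last three by
# the interaction of features 1 and 2 (illustrative values only).
samples = [
    [3.0, 0.0, 0.0], [3.2, 0.1, 0.0], [2.9, 0.0, 0.1],
    [0.0, 2.0, 2.0], [0.1, 2.1, 1.9], [0.0, 1.9, 2.2],
]
baseline = [0.0, 0.0, 0.0]  # assumed "healthy" reference point

phis = [shapley_values(fault_score, x, baseline) for x in samples]
labels = kmeans(phis, k=2)  # cluster in Shapley space, not raw feature space

# Hand-written two-term rule on the original feature scale for cluster 1.
precision = rule_precision(lambda x: x[1] > 1.0 and x[2] > 1.0, samples, labels, 1)
print(labels, precision)
```

In this sketch the attribution vectors separate the two fault mechanisms because each sample's Shapley values concentrate on the features that drive its score. The rule is written by hand here; the abstract does not say how the rules are derived, so a procedure for learning them is not shown.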