Beyond model interpretability: socio-structural explanations in machine learning

Cited by: 0
Authors
Smart, Andrew [1 ]
Kasirzadeh, Atoosa [2 ]
Affiliations
[1] Google Research, San Francisco, CA 94105, USA
[2] University of Edinburgh, Edinburgh, Scotland
Keywords
Machine learning; Interpretability; Explainability; Social structures; Social structural explanations; Responsible AI; Racial bias; Health
DOI
10.1007/s00146-024-02056-1
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
What is it to interpret the outputs of an opaque machine learning model? One approach is to develop interpretable machine learning techniques. These techniques aim to show how machine learning models function by providing either model-centric local or global explanations, which can be based on mechanistic interpretations (revealing the inner working mechanisms of models) or non-mechanistic approximations (showing input feature-output data relationships). In this paper, we draw on social philosophy to argue that interpreting machine learning outputs in certain normatively salient domains could require appealing to a third type of explanation that we call "socio-structural" explanation. The relevance of this explanation type is motivated by the fact that machine learning models are not isolated entities but are embedded within and shaped by social structures. Socio-structural explanations aim to illustrate how social structures contribute to and partially explain the outputs of machine learning models. We demonstrate the importance of socio-structural explanations by examining a racially biased healthcare allocation algorithm. Our proposal highlights the need for transparency beyond model interpretability: understanding the outputs of machine learning systems could require a broader analysis that extends beyond the understanding of the machine learning model itself.
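To make the "non-mechanistic approximation" type of explanation mentioned in the abstract concrete, the following is a minimal illustrative sketch (not the paper's method) of a model-centric local explanation: a LIME-style linear surrogate fitted around one input of a black-box model, whose coefficients approximate local input feature-output relationships. It assumes scikit-learn and NumPy; all data, parameters, and variable names are synthetic placeholders.

```python
# A minimal sketch of a non-mechanistic, model-centric local explanation:
# probe an opaque model around one input and fit a weighted linear
# surrogate (LIME-style). Everything here is an illustrative assumption.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Train an opaque "black-box" model on synthetic data.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestRegressor(random_state=0).fit(X, y)

# Perturb one instance to sample the model's behaviour near it.
x0 = X[0]
rng = np.random.default_rng(0)
perturbed = x0 + rng.normal(scale=0.5, size=(200, x0.size))
preds = black_box.predict(perturbed)

# Weight perturbations by proximity to x0, then fit a linear surrogate;
# its coefficients approximate local feature-output relationships.
weights = np.exp(-np.linalg.norm(perturbed - x0, axis=1) ** 2)
surrogate = LinearRegression().fit(perturbed, preds, sample_weight=weights)
print("Local feature attributions:", surrogate.coef_)
```

On the paper's framing, such a surrogate explains the model's output in terms of its inputs only; a socio-structural explanation would additionally ask how the social structures generating those inputs shape the output.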
Pages: 2045-2053
Number of pages: 9