What Did My AI Learn? How Data Scientists Make Sense of Model Behavior

Cited by: 19
Authors
Cabrera, Angel Alexander [1]
Ribeiro, Marco Tulio [2]
Lee, Bongshin [2]
Deline, Robert [2]
Perer, Adam [1]
Drucker, Steven M. [2]
Affiliations
[1] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[2] Microsoft Res, Microsoft Bldg 99,14820 NE 36th St, Redmond, WA 98052 USA
Funding
National Science Foundation (USA);
Keywords
Machine learning; AI; machine behavior; machine learning testing; sensemaking; visualization;
DOI
10.1145/3542921
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data scientists require rich mental models of how AI systems behave to effectively train, debug, and work with them. Despite the prevalence of AI analysis tools, there is no general theory describing how people make sense of what their models have learned. We frame this process as a form of sensemaking and derive a framework describing how data scientists develop mental models of AI behavior. To evaluate the framework, we show how existing AI analysis tools fit into this sensemaking process and use it to design AIFinnity, a system for analyzing image-and-text models. Lastly, we explored how data scientists use a tool developed with the framework through a think-aloud study with 10 data scientists tasked with using AIFinnity to pick an image captioning model. We found that AIFinnity's sensemaking workflow reflected participants' mental processes and enabled them to discover and validate diverse AI behaviors.
Pages: 27