Citizens' data afterlives: Practices of dataset inclusion in machine learning for public welfare

Cited by: 1
Authors
Ratner, Helene Friis [1, 2]
Thylstrup, Nanna Bonde [2]
Affiliations
[1] Aarhus Univ, Danish Sch Educ DPU, Tuborgvej 164, DK-2400 Copenhagen N, Denmark
[2] Univ Copenhagen, Dept Arts & Cultural Studies, Karen Blixensvej 1, DK-2300 Copenhagen, Denmark
Keywords
Machine learning; Welfare state; Data afterlives; Dataset negotiations; DATABASES; CHILD; CARE
DOI
10.1007/s00146-024-01920-4
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Public sector adoption of AI techniques in welfare systems recasts historic national data as a resource for machine learning. In this paper, we examine how the use of register data for the development of predictive models produces new 'afterlives' for citizen data. First, we document a Danish research project's practical efforts to develop an algorithmic decision-support model for social workers to classify children's risk of maltreatment. Second, we outline the tensions emerging from project members' negotiations about which datasets to include. Third, we identify three types of afterlives for citizen data in machine learning projects: (1) data afterlives for training and testing the algorithm, acting as 'ground truth' for inferring futures, (2) data afterlives for validating the algorithmic model, acting as markers of robustness, and (3) data afterlives for improving the model's fairness, valuated for reasons of data ethics. We conclude by discussing how, on the one hand, these afterlives engender new ethical relations between state and citizens, and how, on the other hand, they articulate an alternative view of the value of datasets, posing interesting contrasts between machine learning projects developed within the context of the Danish welfare state and mainstream corporate AI discourses of 'the bigger, the better'.
Pages: 1183-1193
Number of pages: 11