Documenting Computer Vision Datasets: An Invitation to Reflexive Data Practices

Cited by: 33
Authors
Miceli, Milagros [1]
Yang, Tianling [1]
Naudts, Laurens [2]
Schuessler, Martin [1]
Serbanescu, Diana [1]
Hanna, Alex [3]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Katholieke Univ Leuven, Ctr IT & IP Law CiTiP, Leuven, Belgium
[3] Google Res, Mountain View, CA USA
Source
PROCEEDINGS OF THE 2021 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, FACCT 2021 | 2021
Keywords
datasheets for datasets; dataset documentation; reflexivity; data annotation; training data; transparency; accountability; audits; machine learning
DOI
10.1145/3442188.3445880
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In industrial computer vision, discretionary decisions surrounding the production of image training data remain widely undocumented. Recent research taking issue with such opacity has proposed standardized processes for dataset documentation. In this paper, we expand this space of inquiry through fieldwork at two data processing companies and thirty interviews with data workers and computer vision practitioners. We identify four key issues that hinder the documentation of image datasets and the effective retrieval of production contexts. Finally, we propose reflexivity, understood as a collective consideration of social and intellectual factors that lead to praxis, as a necessary precondition for documentation. Reflexive documentation can help to expose the contexts, relations, routines, and power structures that shape data.
Pages: 161-172
Page count: 12