ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia

Cited by: 13
Authors
Halfaker A. [1 ]
Geiger R.S. [2 ]
Affiliations
[1] Microsoft, 1 Microsoft Way, Redmond, WA 98052
[2] Department of Communication, Halıcıoğlu Data Science Institute, University of California, San Diego, 9500 Gilman Drive, San Diego, CA 92093
Keywords
algorithms; fairness; governance; machine learning; reflection; transparency; Wikipedia
DOI
10.1145/3415219
Abstract
Algorithmic systems, from rule-based bots to machine learning classifiers, have a long history of supporting the essential work of content moderation and other curation work in peer production projects. From counter-vandalism to task routing, basic machine prediction has allowed open knowledge projects like Wikipedia to scale to the largest encyclopedia in the world while maintaining quality and consistency. However, conversations about how quality control should work and what role algorithms should play have generally been led by the expert engineers who have the skills and resources to develop and modify these complex algorithmic systems. In this paper, we describe ORES: an algorithmic scoring service that supports real-time scoring of wiki edits using multiple independent classifiers trained on different datasets. ORES decouples several activities that have typically all been performed by engineers: choosing or curating training data, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions. This meta-algorithmic system was designed to open up socio-technical conversations about algorithms in Wikipedia to a broader set of participants. In this paper, we discuss the theoretical mechanisms of social change ORES enables and detail case studies in participatory machine learning around ORES from the five years since its deployment. © 2020 Owner/Author.
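For illustration, below is a minimal sketch (not from the paper) of how a client tool might request edit scores from the ORES service the abstract describes, assuming the public v3 REST endpoint that ORES exposed at ores.wikimedia.org. The revision ID is a placeholder; "damaging" and "goodfaith" name two of the independent classifiers the paper discusses.

# Minimal sketch of querying the ORES scoring service (assumes the
# public v3 API at ores.wikimedia.org; revision ID is illustrative).
import requests

ORES_HOST = "https://ores.wikimedia.org"

def score_revision(context, rev_id, models):
    """Fetch predictions for one revision from the named models."""
    url = f"{ORES_HOST}/v3/scores/{context}/{rev_id}/" + "|".join(models)
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Response is keyed by wiki context, then revision ID, then model.
    scores = response.json()[context]["scores"][str(rev_id)]
    return {model: scores[model]["score"] for model in models}

if __name__ == "__main__":
    result = score_revision("enwiki", 123456789, ["damaging", "goodfaith"])
    for model, score in result.items():
        print(model, score["prediction"], score["probability"])

Each model returns a prediction together with per-class probabilities rather than a single hard label, which is what lets the downstream interfaces and automated agents the abstract mentions choose their own confidence thresholds.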