A User-Centered Approach to Gamify the Manual Creation of Training Data for Machine Learning

Cited by: 2

Authors:

Alaghbari S. [1]

Mitschick A. [2]

Blichmann G. [1]

Voigt M. [1]

Dachselt R. [2,3]

Affiliations:

[1] AI4BD Deutschland GmbH, Dresden

[2] Technische Universität Dresden, Dresden

[3] Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, Dresden

Source:

i-com | 2021 / Volume 20 / Issue 01

Keywords:

gamification; machine learning; object labeling; training data

DOI:

10.1515/icom-2020-0030

Abstract:

The development of artificial intelligence, e.g. for computer vision, through supervised learning requires large amounts of annotated or labeled data objects as training data. Usually, high-quality training data is created manually, which can be repetitive and tiring. Gamification, the use of game elements in a non-game context, is one method to make such tedious tasks more interesting. We propose a multi-step process for gamifying the manual creation of training data for machine learning purposes. In this article, we give an overview of related concepts and existing implementations and present a user-centered approach for a real-life use case. Based on a survey within the target user group, we identified annotation use cases and dominant player characteristics. The results served as a foundation for designing the gamification concepts, which were then discussed with the participants. The final concept includes levels of increasing difficulty, tutorials, progress indicators, and a narrative built around a robot character that also serves as a user assistant. The implemented prototype is an extension of an existing annotation tool at an AI product company and serves as a basis for further observations. © 2021 Walter de Gruyter GmbH, Berlin/Boston.

Pages: 33-48

Page count: 15
