Dataset bias: A case study for visual question answering

Authors
Das A. [1]
Anjum S. [1]
Gurari D. [1]
Affiliation
[1] School of Information, University of Texas at Austin
Source
Proceedings of the Association for Information Science and Technology | 2019 / Vol. 56 / No. 1
Funding
National Science Foundation (USA)
Keywords
Assistive Technologies; Bias in Machine Learning; Ethics of Artificial Intelligence; Visual Question Answering
DOI
10.1002/pra2.7
Abstract
We examine the issue of bias in datasets designed to train visual question answering (VQA) algorithms. These datasets include a collection of natural language questions about images (also known as visual questions). We consider three popular datasets whose visual questions come, respectively, from people with sight, from people who are blind, and from computers. We first demonstrate that machine learning algorithms can be trained to recognize each dataset's bias, and so determine the source of a novel visual question. We then discuss potential risks and benefits of biased VQA datasets and of corresponding machine learning algorithms that can identify the source of a visual question; e.g., whether it comes from a person with sight, a person who is blind, or a bot (i.e., a computer). Our ultimate aim is to inspire the development of more inclusive VQA systems. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.
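The abstract's first claim, that a classifier can learn to tell which dataset a question came from, can be illustrated with a minimal sketch. This is not the authors' method: it assumes a text-only bag-of-words Naive Bayes model, ignores the images entirely, and uses a handful of invented toy questions (`TOY_TRAIN`, with the hypothetical labels `sighted`, `blind`, and `bot`) rather than any real dataset.

```python
import math
from collections import Counter, defaultdict

# Invented toy questions for illustration only; NOT drawn from the paper's datasets.
TOY_TRAIN = [
    ("what color is the umbrella", "sighted"),
    ("how many people are in the photo", "sighted"),
    ("is the man wearing a hat", "sighted"),
    ("what is this", "blind"),
    ("can you tell me what this can is", "blind"),
    ("what does this label say please", "blind"),
    ("are there more cubes than spheres", "bot"),
    ("what number of large red cubes are there", "bot"),
    ("is the small sphere the same color as the cube", "bot"),
]


def train(examples):
    """Count words per source label (multinomial Naive Bayes statistics)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the most likely source label for a question."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior plus Laplace-smoothed log likelihood of each known token.
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_TRAIN)
print(predict(model, "what is this"))
print(predict(model, "are there more red cubes than spheres"))
```

Even on this toy data, word choice alone separates the three sources, which is the kind of dataset-specific signal the abstract calls bias. A realistic experiment would need the actual datasets, held-out evaluation, and likely image features as well.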
Pages: 58-67
Page count: 9