Overcoming the pitfalls and perils of algorithms: A classification of machine learning biases and mitigation methods

Cited by: 91
Authors
van Giffen, Benjamin [1]
Herhausen, Dennis [2]
Fahse, Tobias [1]
Affiliations
[1] Univ St Gallen, Inst Informat Management, St Gallen, Switzerland
[2] Vrije Univ Amsterdam, Sch Business & Econ, Amsterdam, Netherlands
Keywords
Machine learning; Artificial intelligence; Bias; Mitigation methods; Case study; Discrimination; Prediction
DOI
10.1016/j.jbusres.2022.01.076
Chinese Library Classification
F [Economics]
Subject classification code
02
Abstract
Over the last decade, the importance of machine learning has increased dramatically in business and marketing. However, when machine learning is used for decision-making, bias rooted in unrepresentative datasets, inadequate models, weak algorithm designs, or human stereotypes can lead to low performance and unfair decisions, resulting in financial, social, and reputational losses. This paper offers a systematic, interdisciplinary literature review of machine learning biases as well as methods to avoid and mitigate these biases. We identify eight distinct machine learning biases, summarize these biases in the cross-industry standard process for data mining to account for all phases of machine learning projects, and outline twenty-four mitigation methods. We further contextualize these biases in a real-world case study and illustrate adequate mitigation strategies. These insights synthesize the literature on machine learning biases in a concise manner and point to the importance of human judgment for machine learning algorithms.
Pages: 93-106
Page count: 14