Analyzing the impact of missing values and selection bias on fairness

Cited by: 20
Authors
Wang, Yanchen [1]
Singh, Lisa [1]
Affiliations
[1] Georgetown Univ, Washington, DC 20057 USA
Keywords
Machine learning fairness; Missing data; Data bias; Selection bias
DOI
10.1007/s41060-021-00259-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Algorithmic decision making is becoming more prevalent, increasingly impacting people's daily lives. Recently, discussions have been emerging about the fairness of decisions made by machines. Researchers have proposed different approaches for improving the fairness of these algorithms. While these approaches can help machines make fairer decisions, they have been developed and validated on fairly clean data sets. Unfortunately, most real-world data have complexities that make them more dirty. This work considers two of these complexities by analyzing the impact of two real-world data issues on fairness (missing values and selection bias) for categorical data. After formulating this problem and showing its existence, we propose fixing algorithms for data sets containing missing values and/or selection bias that use different forms of reweighting and resampling based upon the missing value generation process. We conduct an extensive empirical evaluation on both real-world and synthetic data using various fairness metrics, and demonstrate how different missing values generated from different mechanisms and selection bias impact prediction fairness, even when prediction accuracy remains fairly constant.
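The abstract's central claim, that missing values can shift a fairness metric and that reweighting can compensate, can be illustrated with a small sketch. This is not the authors' algorithm. It is a toy example under stated assumptions: the synthetic counts, the group names "A" and "B", the single categorical feature, and the helper names (`make_rows`, `parity_difference`, `inverse_probability_weights`) are all hypothetical. It assumes missingness depends only on the group and the label, both of which are observed for every row, and it uses statistical parity difference as one possible fairness metric.

```python
from collections import defaultdict


def make_rows():
    """Build a hypothetical data set: 2 groups x 2 labels x 50 rows each.

    Assumption for illustration only: the feature is missing for 30 of the
    50 positive-label rows in group "B" and observed everywhere else.
    """
    rows = []
    for group in ("A", "B"):
        for label in (0, 1):
            for i in range(50):
                missing = group == "B" and label == 1 and i < 30
                rows.append({
                    "group": group,
                    "label": label,
                    "feature": None if missing else "x",
                })
    return rows


def positive_rate(rows, group, weights=None):
    """Weighted share of positive labels within one group."""
    if weights is None:
        weights = [1.0] * len(rows)
    total = sum(w for r, w in zip(rows, weights) if r["group"] == group)
    positive = sum(
        w for r, w in zip(rows, weights)
        if r["group"] == group and r["label"] == 1
    )
    return positive / total


def parity_difference(rows, weights=None):
    """Statistical parity difference: P(y=1 | A) - P(y=1 | B)."""
    return positive_rate(rows, "A", weights) - positive_rate(rows, "B", weights)


def inverse_probability_weights(all_rows, complete_rows):
    """Weight each complete row by 1 / P(observed | group, label).

    The observation rate is estimated per (group, label) cell from the
    full data, which is possible here because group and label are never
    missing in this toy setup.
    """
    total = defaultdict(int)
    observed = defaultdict(int)
    for r in all_rows:
        key = (r["group"], r["label"])
        total[key] += 1
        if r["feature"] is not None:
            observed[key] += 1
    return [
        total[(r["group"], r["label"])] / observed[(r["group"], r["label"])]
        for r in complete_rows
    ]


rows = make_rows()
# Listwise deletion: keep only rows whose feature was observed.
complete = [r for r in rows if r["feature"] is not None]
weights = inverse_probability_weights(rows, complete)

print("full data:          ", parity_difference(rows))
print("after deletion:     ", parity_difference(complete))
print("deletion + reweight:", parity_difference(complete, weights))
```

In this toy setup the full data has a parity difference of zero, dropping incomplete rows pushes it to roughly 0.21, and reweighting the complete rows brings it back to zero. The correction only works because the missingness mechanism here depends on observed variables; the abstract notes that the appropriate fix depends on the missing value generation process.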
Pages: 101-119
Page count: 19