Model-based and actual independence for fairness-aware classification

Cited by: 10
Authors
Kamishima, Toshihiro [1]
Akaho, Shotaro [1]
Asoh, Hideki [1]
Sakuma, Jun [2, 3]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, AIST Tsukuba Cent 2,Umezono 1-1-1, Tsukuba, Ibaraki 3058568, Japan
[2] Univ Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058577, Japan
[3] RIKEN, Ctr Adv Intelligence Project, Chuo Ku, 1-4-1 Nihonbashi, Tokyo, Japan
Keywords
Fairness; Discrimination; Classification; Cost-sensitive learning
DOI
10.1007/s10618-017-0534-x
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of fairness-aware classification is to categorize data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. For example, when applying data mining technologies to university admissions, admission criteria must be non-discriminatory and fair with regard to sensitive features, such as gender or race. In this context, such fairness can be formalized as statistical independence between classification results and sensitive features. The main purpose of this paper is to analyze this formal fairness in order to achieve better trade-offs between fairness and prediction accuracy, which is important for applying fairness-aware classifiers in practical use. We focus on a fairness-aware classifier, Calders and Verwer's two-naive-Bayes (CV2NB) method, which has been shown to be superior to other classifiers in terms of fairness. We hypothesize that this superiority is due to the difference in types of independence. That is, because CV2NB achieves actual independence, rather than satisfying model-based independence like the other classifiers, it can account for model bias and a deterministic decision rule. We empirically validate this hypothesis by modifying two fairness-aware classifiers, a prejudice remover method and a reject option-based classification (ROC) method, so as to satisfy actual independence. The fairness of these two modified methods was drastically improved, showing the importance of maintaining actual independence, rather than model-based independence. We additionally extend an approach adopted in the ROC method so as to make it applicable to classifiers other than those with generative models, such as SVMs.
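The distinction the abstract draws between model-based and actual independence can be illustrated with a small numerical sketch. The data, the 0.5 threshold, and the helper names `mean` and `group_gap` below are assumptions made for illustration; they are not taken from the paper, and the paper's own formal definitions may differ. The sketch compares the gap between two sensitive groups measured on predicted probabilities (a model-based view) with the gap measured on the labels produced by a deterministic decision rule (an actual view).

```python
def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def group_gap(values, sensitive):
    """Mean of `values` in group s=0 minus the mean in group s=1."""
    g0 = [v for v, s in zip(values, sensitive) if s == 0]
    g1 = [v for v, s in zip(values, sensitive) if s == 1]
    return mean(g0) - mean(g1)


# Hypothetical predicted probabilities Pr[y=1 | x, s] for six individuals
# (illustrative values, not data from the paper).
probs = [0.6, 0.6, 0.6, 0.9, 0.9, 0.0]
sensitive = [0, 0, 0, 1, 1, 1]

# Model-based view: both groups have the same average predicted probability.
model_gap = group_gap(probs, sensitive)

# Actual view: apply a deterministic decision rule (assumed threshold 0.5)
# and compare the resulting positive-label rates per group.
labels = [1 if p >= 0.5 else 0 for p in probs]
actual_gap = group_gap(labels, sensitive)

print(f"model-based gap: {model_gap:.3f}")
print(f"actual gap:      {actual_gap:.3f}")
```

In this toy case the average probabilities match across groups, yet thresholding gives every member of group 0 a positive label and only two of three members of group 1, so the gap on actual decisions is about one third. This is one way a classifier could look independent of the sensitive feature at the model level while its deterministic outputs are not.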
Pages: 258-286
Number of pages: 29
Related papers
27 in total
  • [1] [Anonymous], 2013, INT C MACH LEARN
  • [2] [Anonymous], 2008, KDD
  • [3] Exploring discrimination: A user-centric evaluation of discrimination-aware data mining
    Berendt, Bettina
    Preibusch, Soeren
    [J]. 12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 344 - 351
  • [4] Bishop C.M., 2006, PATTERN RECOGN, V4, P738, DOI 10.1117/1.2819119
  • [5] Controlling Attribute Effect in Linear Regression
    Calders, Toon
    Karim, Asim
    Kamiran, Faisal
    Ali, Wasif
    Zhang, Xiangliang
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 71 - 80
  • [6] Three naive Bayes approaches for discrimination-free classification
    Calders, Toon
    Verwer, Sicco
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2010, 21 (02) : 277 - 292
  • [7] Dwork C., 2012, P 3 ITCS C, P214, DOI 10.1145/2090236.2090255
  • [8] Elkan C., 2001, The Foundations of Cost-Sensitive Learning, P973
  • [9] Certifying and Removing Disparate Impact
    Feldman, Michael
    Friedler, Sorelle A.
    Moeller, John
    Scheidegger, Carlos
    Venkatasubramanian, Suresh
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 259 - 268
  • [10] Frank A., 2010, UCI Machine Learning Repository.