Automated Directed Fairness Testing

Cited by: 112
Authors
Udeshi, Sakshi [1 ]
Arora, Pryanshu [2 ]
Chattopadhyay, Sudipta [1 ]
Affiliations
[1] Singapore Univ Tech & Design, Singapore, Singapore
[2] BITS Pilani, Pilani, Rajasthan, India
Source
PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18) | 2018
Keywords
Software Fairness; Directed Testing; Machine Learning;
DOI
10.1145/3238147.3238165
Chinese Library Classification
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Fairness is a critical trait in decision making. As machine-learning models are increasingly used in sensitive application domains (e.g., education and employment) for decision making, it is crucial that the decisions computed by such models are free of unintended bias. But how can we automatically validate the fairness of arbitrary machine-learning models? For a given machine-learning model and a set of sensitive input parameters, our AEQUITAS approach automatically discovers discriminatory inputs that highlight fairness violations. At the core of AEQUITAS are three novel strategies that employ probabilistic search over the input space with the objective of uncovering fairness violations. AEQUITAS leverages the inherent robustness property of common machine-learning models to design and implement scalable test-generation methodologies. An appealing feature of the generated test inputs is that they can be systematically added to the training set of the underlying model to improve its fairness. To this end, we design a fully automated module that is guaranteed to improve the fairness of the model. We implemented AEQUITAS and evaluated it on six state-of-the-art classifiers, including one that was designed with fairness in mind. We show that AEQUITAS effectively generates inputs that uncover fairness violations in all the subject classifiers and systematically improves the fairness of the respective models using the generated test inputs. In our evaluation, AEQUITAS generates up to 70% discriminatory inputs (w.r.t. the total number of inputs generated) and leverages these inputs to improve fairness by up to 94%.
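To make the abstract's search idea concrete, below is a minimal Python sketch (not the authors' implementation) of the two ingredients it describes: the discriminatory-input check underlying individual fairness, and a random global search phase over the input space. It assumes discrete attribute domains and a scikit-learn-style classifier exposing a predict method; all names (is_discriminatory, global_search, domains) are illustrative, not taken from the paper's artifact.

```python
import random

def is_discriminatory(model, x, sensitive_idx, sensitive_values):
    """x witnesses a fairness violation if changing ONLY its
    sensitive attribute flips the model's decision."""
    base = model.predict([x])[0]
    for v in sensitive_values:
        if v == x[sensitive_idx]:
            continue
        x_prime = list(x)
        x_prime[sensitive_idx] = v  # perturb the sensitive attribute only
        if model.predict([x_prime])[0] != base:
            return True
    return False

def global_search(model, domains, sensitive_idx, n_samples=1000, seed=0):
    """Sample the input space uniformly and collect discriminatory seeds.
    AEQUITAS-style testing would follow this with a local phase that
    perturbs the non-sensitive attributes of each seed, exploiting model
    robustness (nearby inputs tend to receive the same label)."""
    rng = random.Random(seed)
    seeds = []
    for _ in range(n_samples):
        x = [rng.choice(domain) for domain in domains]
        if is_discriminatory(model, x, sensitive_idx, domains[sensitive_idx]):
            seeds.append(x)
    return seeds
```

Under this reading, the collected seeds would feed both the local search and the retraining module the abstract mentions, which adds discriminatory inputs back into the training set to improve fairness.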
Pages: 98 - 108
Page count: 11