Tailoring Data Source Distributions for Fairness-aware Data Integration

Cited by: 16
Authors
Nargesian, Fatemeh [1]
Asudeh, Abolfazl [2]
Jagadish, H. V. [3]
Affiliations
[1] Univ Rochester, Rochester, NY USA
[2] Univ Illinois, Chicago, IL USA
[3] Univ Michigan, Ann Arbor, MI 48109 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2021 / Vol. 14 / No. 11
Funding
National Science Foundation (USA);
DOI
10.14778/3476249.3476299
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data scientists often develop data sets for analysis by drawing upon sources of data available to them. A major challenge is to ensure that the data set used for analysis has an appropriate representation of relevant (demographic) groups, that is, that it meets desired distribution requirements. Whether data is collected through some experiment or obtained from some data provider, the data from any single source may not meet the desired distribution requirements. Therefore, a union of data from multiple sources is often required. In this paper, we study how to acquire such data in the most cost-effective manner, for typical cost functions observed in practice. We present an optimal solution for binary groups when the underlying distributions of data sources are known and all data sources have equal costs. For the generic case with unequal costs, we design an approximation algorithm that performs well in practice. When the underlying distributions are unknown, we develop an exploration-exploitation-based strategy with a reward function that captures the cost and approximations of group distributions in each data source. Besides theoretical analysis, we conduct comprehensive experiments that confirm the effectiveness of our algorithms.
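As a rough illustration of the setting described in the abstract (and not the paper's algorithms), the following Python sketch simulates acquiring tuples from several sources with known group distributions and equal unit cost until per-group count requirements are met. The greedy selection rule, the function name `collect`, and all of its parameters are assumptions introduced here for illustration only.

```python
import random


def collect(sources, required, seed=0, max_queries=100_000):
    """Simulate acquiring tuples until each group's required count is met.

    sources:  list of dicts mapping group -> probability (assumed known).
    required: dict mapping group -> number of tuples needed.
    Returns (counts collected per group, total number of queries issued).
    Every query costs 1 (equal-cost assumption) and yields one random tuple.
    """
    rng = random.Random(seed)
    counts = {g: 0 for g in required}
    queries = 0
    while any(counts[g] < required[g] for g in required):
        if queries >= max_queries:
            raise RuntimeError("query budget exhausted")
        needed = [g for g in required if counts[g] < required[g]]
        # Greedy heuristic (an assumption, not the paper's method): query the
        # source most likely to return a tuple from a still-needed group.
        best = max(sources, key=lambda p: sum(p.get(g, 0.0) for g in needed))
        groups = list(best)
        drawn = rng.choices(groups, weights=[best[g] for g in groups])[0]
        queries += 1
        # Keep the tuple only if its group is still under-filled.
        if drawn in counts and counts[drawn] < required[drawn]:
            counts[drawn] += 1
    return counts, queries


# Hypothetical example: two sources over binary groups "a" and "b".
sources = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}]
counts, cost = collect(sources, {"a": 50, "b": 50})
```

In this toy setup the total cost is simply the number of queries issued, so it can never be lower than the sum of the required counts; tuples drawn from an already satisfied group are paid for but discarded.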
Pages: 2519-2532
Page count: 14