TREATS: Fairness-aware entity resolution over streaming data

Cited by: 0
Authors
Araujo, Tiago Brasileiro [1, 2]
Efthymiou, Vasilis [3, 4]
Christophides, Vassilis [5]
Pitoura, Evaggelia [6]
Stefanidis, Kostas [1]
Affiliations
[1] Tampere Univ, Tampere, Finland
[2] Fed Inst Paraiba, Soledade, Brazil
[3] Harokopio Univ Athens, Athens, Greece
[4] FORTH ICS, Iraklion, Greece
[5] ENSEA, ETIS, Paris, France
[6] Univ Ioannina, Ioannina, Greece
Keywords
Entity resolution; Streaming data; Fairness; Incremental processing; Distributed processing; Machine learning
DOI
10.1016/j.is.2024.102506
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growing proliferation of information systems continuously generates large volumes of data from a variety of sources, such as web platforms, social networks, and multiple devices. These data, often lacking a defined schema, require an initial process of consolidation and cleansing before analysis and knowledge extraction can occur. In this context, Entity Resolution (ER) plays a crucial role, facilitating the integration of knowledge bases and identifying similarities among entities from different sources. However, the traditional ER process is computationally expensive, and it becomes more complicated in the streaming context, where data arrive continuously. Moreover, there is a lack of studies on fairness, understood as the absence of discrimination or bias, in ER. Fairness criteria aim to mitigate the implications of data bias in ER systems, which requires more than just optimizing accuracy, as is traditionally done. This work presents TREATS, a schema-agnostic and fairness-aware ER workflow developed for managing streaming data incrementally. The proposed fairness-aware ER framework tackles constraints across various groups of interest, presenting a resilient and equitable solution to the related challenges. In an experimental evaluation, the proposed techniques and heuristics are compared against state-of-the-art approaches over five real-world data source pairs; the results demonstrate significant improvements in terms of fairness, without degradation of effectiveness and efficiency measures in the streaming environment. In summary, our contributions aim to propel the ER field forward by providing a workflow that addresses both technical challenges and ethical concerns.
Pages: 16