Quality Assessment and Biases in Reused Data

Cited by: 3
Authors
Fernandez-Ardevo, Mireia [1, 2]
Rosales, Andrea [1, 2]
Affiliations
[1] Univ Oberta Catalunya UOC, Fac Informat & Commun Sci, Barcelona, Catalonia, Spain
[2] Univ Oberta Catalunya UOC, IN3 Internet Interdisciplinary Inst, Barcelona, Catalonia, Spain
Keywords
data quality; data biases; reused data; reused traces; open data; online behavioral advertising
DOI
10.1177/00027642221144855
Chinese Library Classification number
B849 [Applied Psychology]
Discipline classification code
040203
Abstract
This article investigates digital and non-digital traces reused beyond the context of creation. A central idea of this article is that no (reused) dataset is perfect. Therefore, data quality assessment becomes essential to determine if a given dataset is "good enough" to be used to fulfill the users' goals. Biases, a possible source of discrimination, have become a relevant data challenge. Consequently, it is appropriate to analyze whether quality assessment indicators provide information on potential biases in the dataset. We use examples representing two opposing sides regarding data access to reflect on the relationship between quality and bias. First, the European Union open data portal fosters the democratization of data and expects users to manipulate the databases directly to perform their analyses. Second, online behavioral advertising systems offer individualized promotional services but do not share the datasets supporting their design. Quality assessment is socially constructed, as there is not a universal definition but a set of quality dimensions, which might change for each professional context. From the users' perspective, trust/credibility stands out as a relevant quality dimension in the two analyzed cases. Results show that quality indicators (whatever they are) provide limited information on potential biases. We suggest that data literacy is most needed among both open data users and clients of behavioral advertising systems. Notably, users must (be able to) understand the limitations of datasets for an optimal and bias-free interpretation of results and decision-making.
Pages: 696-710
Number of pages: 15