A classification of data quality assessment and improvement methods

Cited by: 0
Authors
Affiliations
[1] Department of Engineering, Institute for Manufacturing, University of Cambridge, 17 Charles Babbage Road, Cambridge
[2] IBM Germany Research and Development, Schoenaicherstrasse 220, Boeblingen
[3] IBM Global Business Services, Hollerithstraße 1, Munich
Source
Woodall, Philip (phil.woodall@eng.cam.ac.uk) | 2014 / Inderscience Enterprises Ltd., 29, route de Pre-Bois, Case Postale 856, CH-1215 Geneva 15, Switzerland / Vol. / Issue 03
Keywords
Automated data quality software; Data quality; Data quality assessment; Data quality assessment methods; Data quality improvement; Automated data quality tools; Data quality improvement methods; Data quality software tools; Information quality
DOI
10.1504/IJIQ.2014.068656
Abstract
Data quality (DQ) assessment and improvement in larger information systems would often not be feasible without using suitable 'DQ methods', which are algorithms that can be automatically executed by computer systems to detect and/or correct problems in datasets. Currently, these methods are already essential, and they will be of even greater importance as the quantity of data in organisational systems grows. This paper provides a review of existing methods for both DQ assessment and improvement and classifies them according to the DQ problem and problem context. Six gaps have been identified in the classification, where no current DQ methods exist, and these show where new methods are required as a guide for future research and DQ tool development. Copyright © 2014 Inderscience Enterprises Ltd.
Pages: 298-321
Page count: 23
Related papers
50 items in total
  • [31] Medical data quality assessment: On the development of an automated framework for medical data curation
    Pezoulas, Vasileios C.
    Kourou, Konstantina D.
    Kalatzis, Fanis
    Exarchos, Themis P.
    Venetsanopoulou, Aliki
    Zampeli, Evi
    Gandolfo, Saviana
    Skopouli, Fotini
    De Vita, Salvatore
    Tzioufas, Athanasios G.
    Fotiadis, Dimitrios I.
    COMPUTERS IN BIOLOGY AND MEDICINE, 2019, 107 : 270 - 283
  • [32] A Preliminary Study on Methods for Retaining Data Quality Problems in Automatically Generated Test Data
    Woodall, Philip
    Oberhofer, Martin
    Borek, Alexander
    AMCIS 2012 PROCEEDINGS, 2012
  • [33] Data quality improvement in cellular production and delivery
    Zhen, WX
    Luo, YT
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT, VOLS 1 AND 2: INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT IN THE GLOBAL ECONOMY, 2005, : 257 - 261
  • [34] Methods of Data Quality Control Based on Quality Grading
    Guo, Yu Jian
    Hong, Liu Shuang
    2010 INTERNATIONAL CONFERENCE ON FUTURE CONTROL AND AUTOMATION (ICFCA 2010), 2010, : 85 - 88
  • [35] A Zero Trust Model Based Framework For Data Quality Assessment
    Mohammed, Mahmood
    Talburt, John R.
    Dagtas, Serhan
    Hollingsworth, Melissa
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 305 - 307
  • [36] A scope classification of data quality requirements for food composition data
    Presser, Karl
    Hinterberger, Hans
    Weber, David
    Norrie, Moira
    FOOD CHEMISTRY, 2016, 193 : 166 - 172
  • [37] Data preparation using data quality matrices for classification mining
    Davidson, Ian
    Tayi, Giri
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2009, 197 (02) : 764 - 772
  • [38] On the Impact of Data Quality on Image Classification Fairness
    Barry, Aki
    Han, Lei
    Demartini, Gianluca
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2225 - 2229
  • [39] A General Framework for Data Uncertainty and Quality Classification
    Simard, Vanessa
    Ronnqvist, Mikael
    Lebel, Luc
    Lehoux, Nadia
    IFAC PAPERSONLINE, 2019, 52 (13) : 277 - 282
  • [40] On Studying the Effect of Data Quality on Classification Performances
    Jouseau, Roxane
    Salva, Sebastien
    Samir, Chafik
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2022, 2022, 13756 : 82 - 93