Data Quality - The Role of Empiricism

Cited by: 22
Authors
Sadiq, Shazia [1]
Dasu, Tamraparni [2]
Dong, Xin Luna [3]
Freire, Juliana [4]
Ilyas, Ihab F. [5]
Link, Sebastian [6]
Miller, Renee J. [7]
Naumann, Felix [8]
Zhou, Xiaofang [1]
Srivastava, Divesh [2]
Affiliations
[1] Univ Queensland, Brisbane, Qld, Australia
[2] AT&T Labs Res, Florham Pk, NJ USA
[3] Amazon, Seattle, WA USA
[4] NYU, New York, NY 10003 USA
[5] Univ Waterloo, Waterloo, ON, Canada
[6] Univ Auckland, Auckland, New Zealand
[7] Univ Toronto, Toronto, ON, Canada
[8] Univ Potsdam, Hasso Plattner Inst, Potsdam, Germany
Keywords
DISCOVERY
DOI
10.1145/3186549.3186559
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
We outline a call to action for promoting empiricism in data quality research. The action points result from an analysis of the landscape of data quality research. The landscape exhibits two dimensions of empiricism in data quality research relating to type of metrics and scope of method. Our study indicates the presence of a data continuum ranging from real to synthetic data, which has implications for how data quality methods are evaluated. The dimensions of empiricism and their inter-relationships provide a means of positioning data quality research, and help expose limitations, gaps and opportunities.
Pages: 35-43
Number of pages: 9