Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach

Cited by: 1
Authors
Kwiatkowska, Mila [1]
Pouw, Frank [2]
Affiliations
[1] Thompson Rivers Univ, Dept Comp Sci, 805 TRU Way, Kamloops, BC, Canada
[2] Thompson Rivers Univ, Dept Environm Sci, 805 TRU Way, Kamloops, BC, Canada
Source
PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA) | 2019
Keywords
Data Quality; Secondary Data Analysis; Ecological Data; Semiotics
DOI
10.5220/0007978403770384
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data quality problems are widespread in secondary data when they are used for data warehousing and data mining. This paper advocates a broad semiotic approach to data quality. The main premises of this expanded semiotic framework are (1) data represent some reality, (2) data are created and interpreted by humans in a communication process, (3) data are used for specific purposes by humans, and (4) data cannot be created, interpreted, and used without knowledge. Thus, the semiotic-based approach to data quality in secondary data analysis has four aspects: (1) representational, (2) communicational, (3) pragmatic, and (4) knowledge-based. To illustrate these four aspects, we present a case study of ecological data analysis used in the creation of an ornithological data warehouse. We discuss temporal data (the ecological notion of time), spatial ecological data (communication processes and protocols used for data collection), and bioacoustic data processing (domain knowledge needed for the specification of data provenance).
Pages: 377-384
Page count: 8