Estimating the count of completeness errors in geographic data sets by means of a generalized Waring regression model

Cited by: 5
Authors
Ariza-Lopez, F. J. [1]
Rodriguez-Avi, J. [2]
Affiliations
[1] Univ Jaen, Dept Ingn Cartograf Geodes & Fotogrametria, Jaen, Spain
[2] Univ Jaen, Dept Estadist & Invest Operat, Jaen, Spain
Keywords
quality; completeness; count of errors; generalized Waring regression model; OVERDISPERSION; INFORMATION;
DOI
10.1080/13658816.2015.1010536
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we propose a statistical model for estimating the probable number of completeness errors (omissions plus commissions) in a cell (a map tile or cluster) of a data set, in order to guide updating or improvement efforts. The number of completeness errors is a count variable related to exogenous covariates that may be known for each cell (e.g. count of features, rural or urban typology) and to other, unknown sources of variation. We propose and fit a generalized Waring regression model for counting these errors in cells of 1 × 1 km² on the Topographic Map of Andalusia (Spain). This model is compared with the Poisson and negative binomial regression models and performs better than both. The empirical relationship established by the model indicates that the number of completeness errors is related to the following exogenous covariates: the number of cartographic features in the data set, whether the cell covers a littoral or urban zone, and the spatial division among the contracted suppliers. For cells with fewer than five errors, most of the variability corresponds to unknown external factors (liability), but as the number of errors rises, the greater part of the variability is due to unknown internal characteristics of each cell (proneness). With these estimates, the producer can derive statistical summaries and spatial representations and better plan production activities such as updating.
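The split of variability into liability and proneness mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual parameterization of the univariate generalized Waring distribution with parameters a, k and rho (a Poisson-gamma-beta mixture); the abstract itself does not state the formulas, so the function names, the parameter values and the labelling of the three terms below are assumptions made for illustration, not the authors' code or fitted results.

```python
import math


def gwd_variance_components(a, k, rho):
    """Split the variance of a generalized Waring count into three parts.

    Assumed parameterization (not taken from the abstract):
      mean        mu = a * k / (rho - 1)
      randomness  mu
      liability   mu * (k + 1) / (rho - 2)
      proneness   mu**2 * (k + rho - 1) / (k * (rho - 2))
    The variance is finite only when rho > 2.
    """
    if a <= 0 or k <= 0 or rho <= 2:
        raise ValueError("need a > 0, k > 0 and rho > 2 for a finite variance")
    mu = a * k / (rho - 1.0)
    return {
        "mean": mu,
        "randomness": mu,
        "liability": mu * (k + 1.0) / (rho - 2.0),
        "proneness": mu ** 2 * (k + rho - 1.0) / (k * (rho - 2.0)),
    }


def a_for_mean(mu, k, rho):
    """Value of a that yields mean mu when k and rho are held fixed."""
    return mu * (rho - 1.0) / k


if __name__ == "__main__":
    k, rho = 2.0, 3.0  # hypothetical values, chosen only for illustration
    for mu in (0.5, 1.0, 5.0, 10.0):
        parts = gwd_variance_components(a_for_mean(mu, k, rho), k, rho)
        total = parts["randomness"] + parts["liability"] + parts["proneness"]
        shares = {
            name: round(parts[name] / total, 3)
            for name in ("randomness", "liability", "proneness")
        }
        print(f"mean={mu}: {shares}")
```

Under these assumptions the liability term grows linearly with the mean while the proneness term grows quadratically, so proneness overtakes liability once the mean exceeds k(k + 1)/(k + rho - 1). This matches the qualitative pattern the abstract reports for cells with few versus many errors.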
Pages: 1394-1418
Page count: 25