The use of uncertainty to choose matching variables in statistical matching

Cited by: 4
Authors
D'Orazio, Marcello [1]
Di Zio, Marco [2]
Scanu, Mauro [2]
Affiliations
[1] UN, FAO, Rome, Italy
[2] Ist Nazl Stat ISTAT, Rome, Italy
Keywords
Data fusion; Synthetical matching; Consistency; Partial identifiability; Partially identified parameters; Confidence intervals
DOI
10.1016/j.ijar.2017.08.015
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Statistical matching aims at combining information available in distinct sample surveys referring to the same target population. The matching is usually based on a set of common variables shared by the available data sources. For matching purposes, only a subset of the common variables should be chosen, the so-called matching variables. The paper presents a novel method for selecting the matching variables based on the analysis of the uncertainty characterizing the matching framework. The uncertainty is caused by the unavailability of data for estimating the parameters that describe the association between variables not jointly observed in a single data source. The paper focuses on the case of categorical variables and presents a sequential procedure for identifying the subset of common variables that is most effective in reducing the overall uncertainty. (C) 2017 Elsevier Inc. All rights reserved.
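
The abstract does not state the exact uncertainty measure or selection rule, so the following is only a minimal sketch under stated assumptions. It assumes that uncertainty about the joint distribution of two categorical variables Y (observed only in sample A) and Z (observed only in sample B) is summarized by the average width of the Fréchet bounds on p(y, z) given the chosen common variables X, and that the sequential procedure is a greedy forward selection. The function names, the pooled estimate of p(x), and the toy data are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import product


def conditional(records, x_idx):
    """Estimate p(target | x) from (x_tuple, target) records,
    using only the positions of x listed in x_idx."""
    counts = defaultdict(Counter)
    for x, target in records:
        counts[tuple(x[i] for i in x_idx)][target] += 1
    return {
        key: {t: n / sum(c.values()) for t, n in c.items()}
        for key, c in counts.items()
    }


def mean_frechet_width(sample_a, sample_b, x_idx):
    """Average width of the Frechet bounds on p(y, z) over all (y, z) cells.

    Assumption: p(x) is estimated from the pooled samples; an x cell that is
    missing from either sample contributes the trivial bounds [0, 1].
    """
    p_y = conditional(sample_a, x_idx)  # p(y | x), estimated from sample A
    p_z = conditional(sample_b, x_idx)  # p(z | x), estimated from sample B
    pooled = Counter(tuple(x[i] for i in x_idx) for x, _ in sample_a + sample_b)
    total = sum(pooled.values())
    y_levels = sorted({y for _, y in sample_a})
    z_levels = sorted({z for _, z in sample_b})
    width = 0.0
    for y, z in product(y_levels, z_levels):
        for key, n in pooled.items():
            px = n / total
            if key not in p_y or key not in p_z:
                width += px  # nothing is known about this cell
                continue
            a = p_y[key].get(y, 0.0)
            b = p_z[key].get(z, 0.0)
            # upper bound min(a, b), lower bound max(0, a + b - 1)
            width += px * (min(a, b) - max(0.0, a + b - 1.0))
    return width / (len(y_levels) * len(z_levels))


def select_matching_variables(sample_a, sample_b, n_vars, tol=1e-9):
    """Greedy forward selection (an assumed procedure): at each step add the
    common variable that reduces the average width the most, and stop when
    no candidate improves it by more than tol."""
    selected = []
    best = mean_frechet_width(sample_a, sample_b, selected)
    remaining = list(range(n_vars))
    while remaining:
        scores = {
            j: mean_frechet_width(sample_a, sample_b, selected + [j])
            for j in remaining
        }
        j_best = min(scores, key=scores.get)
        if best - scores[j_best] <= tol:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return selected, best


# Toy data: x = (X1, X2). X1 determines both Y and Z; X2 is unrelated noise.
sample_a = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "b"), ((1, 1), "b")]
sample_b = [((0, 0), "u"), ((0, 1), "u"), ((1, 0), "v"), ((1, 1), "v")]

if __name__ == "__main__":
    print(select_matching_variables(sample_a, sample_b, n_vars=2))
```

In the toy data, conditioning on X1 closes the bounds completely, while X2 leaves them unchanged, so the greedy search keeps only X1. Treating unobserved x cells as fully uncertain is one simple way to discourage overly sparse cross-classifications; the criterion actually used in the paper may differ.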
Pages: 433-440
Number of pages: 8