The use of uncertainty to choose matching variables in statistical matching

Cited by: 4

Authors:

D'Orazio, Marcello [1]

Di Zio, Marco [2]

Scanu, Mauro [2]

Affiliations:

[1] UN, FAO, Rome, Italy

[2] Ist Nazl Stat ISTAT, Rome, Italy

Source:

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING | 2017 / Volume 90

Keywords:

Data fusion; Synthetical matching; Consistency; Partial identifiability; PARTIALLY IDENTIFIED PARAMETERS; CONFIDENCE-INTERVALS;

DOI:

10.1016/j.ijar.2017.08.015

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Statistical matching aims at combining information available in distinct sample surveys referring to the same target population. The matching is usually based on a set of common variables shared by the available data sources. For matching purposes, only a subset of the common variables should be chosen: the so-called matching variables. The paper presents a novel method for selecting the matching variables based on the analysis of the uncertainty characterizing the matching framework. The uncertainty is caused by the unavailability of data for estimating the parameters that describe the association between variables not jointly observed in a single data source. The paper focuses on the case of categorical variables and presents a sequential procedure for identifying the subset of common variables that is most effective in reducing the overall uncertainty. (C) 2017 Elsevier Inc. All rights reserved.
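To make the notion of uncertainty in the abstract concrete, here is a minimal Python sketch. It assumes the uncertainty about a cell probability P(Y=y, Z=z) is expressed through the classical Fréchet bounds computed within each category of a candidate matching variable X, and that the bounds are summarised by their average width. The function names, the toy probabilities, and the choice of summary are illustrative assumptions, not the procedure defined in the paper.

```python
def frechet_bounds(p_y, p_z):
    """Fréchet bounds on P(Y=y, Z=z) given only the two marginal distributions."""
    return {
        (y, z): (max(0.0, py + pz - 1.0), min(py, pz))
        for y, py in p_y.items()
        for z, pz in p_z.items()
    }


def mean_width(p_x, p_y_given_x, p_z_given_x):
    """Average width of the bounds on P(y, z), mixing the conditional bounds over x.

    Hypothetical summary measure: smaller values mean X leaves less
    uncertainty about the joint distribution of Y and Z.
    """
    widths = {}
    for x, px in p_x.items():
        bounds = frechet_bounds(p_y_given_x[x], p_z_given_x[x])
        for cell, (lower, upper) in bounds.items():
            widths[cell] = widths.get(cell, 0.0) + px * (upper - lower)
    return sum(widths.values()) / len(widths)


# Toy data (invented for illustration): Y and Z are binary.
uniform = {0: 0.5, 1: 0.5}

# Case 1: no matching variable (X has a single category).
no_x = mean_width({"all": 1.0}, {"all": uniform}, {"all": uniform})

# Case 2: a binary X that determines Y exactly.
with_x = mean_width(
    {"a": 0.5, "b": 0.5},
    {"a": {0: 1.0, 1: 0.0}, "b": {0: 0.0, 1: 1.0}},
    {"a": uniform, "b": uniform},
)

print(no_x, with_x)
```

In this toy case, conditioning on an X that determines Y collapses every interval to a point, so the average width falls to zero. A sequential selection in the spirit of the abstract could repeat such a computation for each candidate common variable and keep the one that reduces the summary most, although the stopping rule and the exact measure used in the paper are not stated here.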


Pages: 433-440

Page count: 8
