Generating partially synthetic geocoded public use data with decreased disclosure risk by using differential smoothing
Cited by: 8
Authors:
Quick, Harrison [1]
Holan, Scott H. [2,3]
Wikle, Christopher K. [2]
Affiliations:
[1] Drexel Univ, Philadelphia, PA 19104 USA
[2] Univ Missouri, Columbia, MO 65211 USA
[3] US Census Bur, Washington, DC USA
Funding:
U.S. National Science Foundation;
Keywords:
Bayesian methods;
Data privacy;
Multiple imputation;
Spatial modelling;
Synthetic data;
DOI:
10.1111/rssa.12360
Chinese Library Classification (CLC):
O1 [Mathematics];
C [Social Sciences, General];
Discipline classification codes:
03 ;
0303 ;
0701 ;
070101 ;
Abstract:
When collecting geocoded confidential data with the intent to disseminate, agencies often resort to altering the geographies before making data publicly available. An alternative to releasing aggregated and/or perturbed data is to release synthetic data, where sensitive values are replaced with draws from models designed to capture distributional features in the data collected. The issues associated with spatially outlying observations in the data, however, have received relatively little attention. Our goal here is to shed light on this problem, to propose a solution, referred to as 'differential smoothing', and to illustrate our approach by using sale prices of homes in San Francisco.
Pages: 649-661
Page count: 13