Zeros and ones: a case for suppressing zeros in sensitive count data with an application to stroke mortality
Cited by: 8
Authors:
Quick, Harrison [1]
Holan, Scott H. [2]
Wikle, Christopher K. [2]
Affiliations:
[1] Ctr Dis Control & Prevent, Div Heart Dis & Stroke Prevent, Atlanta, GA 30329 USA
[2] Univ Missouri, Dept Stat, Columbia, MO 65211 USA
Source:
STAT | 2015 / Vol. 4 / Issue 01
Funding:
U.S. National Science Foundation;
Keywords:
Bayesian methods;
data privacy;
disclosure limitation;
spatial data analysis;
synthetic data;
DOI:
10.1002/sta4.92
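Chinese Library Classification (CLC) numbers:
O21 [Probability theory and mathematical statistics];
C8 [Statistics];
Discipline classification codes:
020208;
070103;
0714;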
Abstract:
In the current era of global internet connectivity, privacy concerns are of the utmost importance. When official statistical agencies collect spatially referenced, confidential data that they intend to release as public-use files, the suppression of small counts is a common measure that agencies take to protect the confidentiality of the data-subjects from ill-intentioned users. The goal of this paper is to demonstrate that an interval suppression criterion that does not suppress zeros can fail to protect regions with a single occurrence. We illustrate the difference in disclosure risk between an interval suppression criterion and a one-sided suppression criterion by considering a US county-level dataset composed of the number of deaths due to stroke in White men. Here, we illustrate that an interval suppression criterion leads to a twofold increase in the disclosure risk when compared with a one-sided suppression criterion for regions with a single incidence among a population of less than 600. We conclude with an extension of these findings beyond stroke mortality and by offering general guidelines for data suppression. Copyright (C) 2015 John Wiley & Sons, Ltd.
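The two criteria contrasted in the abstract can be illustrated with a minimal Python sketch. The threshold of 10, the function names, and the example counts are assumptions made for illustration only; they are not taken from the paper. The sketch shows only the release rules themselves, not the authors' Bayesian risk analysis.

```python
from typing import List, Optional

# Hypothetical cutoff chosen for illustration; the abstract does not state one.
THRESHOLD = 10


def interval_suppress(count: int, threshold: int = THRESHOLD) -> Optional[int]:
    """Interval criterion: suppress counts in [1, threshold); release zeros."""
    return None if 1 <= count < threshold else count


def one_sided_suppress(count: int, threshold: int = THRESHOLD) -> Optional[int]:
    """One-sided criterion: suppress every count below threshold, zeros included."""
    return None if count < threshold else count


def candidate_values(rule: str, threshold: int = THRESHOLD) -> List[int]:
    """Values a suppressed cell could hold, as seen by a data user."""
    start = 1 if rule == "interval" else 0
    return list(range(start, threshold))


# Made-up county counts, not real mortality data.
counts = [0, 1, 4, 12]
interval_release = [interval_suppress(c) for c in counts]
one_sided_release = [one_sided_suppress(c) for c in counts]

print(interval_release)   # None marks a suppressed cell
print(one_sided_release)
```

Under the interval rule, a suppressed cell tells the user that at least one event occurred, because zeros would have been published. Under the one-sided rule, zero remains among the candidate values, so suppression alone does not confirm that an event occurred.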
Pages: 227-234
Page count: 8