Beat the Heap: An Imputation Strategy for Valid Inferences from Rounded Income Data

Cited by: 11
Authors
Drechsler, Joerg [1]
Kiesl, Hans [2]
Affiliations
[1] Inst Employment Res, Regensburger Str 104, D-90478 Nurnberg, Germany
[2] OTH Regensburg, Dept Comp Sci & Math, Regensburg, Germany
Keywords
Heaping; Measurement error; Multiple imputation; Poverty rate
DOI
10.1093/jssam/smv032
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification codes
03; 0303; 0701; 070101
Abstract
Questions on income in surveys are prone to two sources of errors that can cause bias if not addressed adequately at the analysis stage. On the one hand, income is considered sensitive information, and response rates on income questions generally tend to be lower than response rates for other nonsensitive questions. On the other hand, respondents usually do not remember their exact income and thus tend to provide a rounded estimate. The negative effects of item nonresponse are well studied, and most statistical agencies have developed sophisticated imputation methods to correct for this potential source of bias. However, to our knowledge, the effects of rounding are hardly ever considered in practice, despite the fact that several studies have found strong evidence that most of the respondents round their reported income values. In this article, we illustrate the substantial impact that rounding can have on important measures derived from the income variable, such as the poverty rate. To obtain unbiased estimates, we propose a two-stage imputation strategy that estimates the posterior probability for rounding given the observed income values at the first stage and reimputes the observed income values given the rounding probabilities at the second stage. A simulation study shows that the proposed imputation model can help overcome the possible negative effects of rounding. We also present results based on the household income variable from the German panel study Labour Market and Social Security.
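The effect of heaping on a threshold-based measure can be illustrated with a small simulation. The sketch below is not the authors' two-stage imputation procedure or their simulation design. It only shows, under assumed values, how rounding reported incomes to multiples of 100, 500, or 1000 can shift a relative poverty rate. The lognormal income distribution, the rounding bases and their weights, and the 60%-of-median poverty line are all assumptions chosen for illustration, and the helper names `poverty_rate` and `heap` are hypothetical.

```python
import random
import statistics


def poverty_rate(incomes, fraction=0.6):
    """Share of incomes strictly below `fraction` times the median."""
    threshold = fraction * statistics.median(incomes)
    return sum(x < threshold for x in incomes) / len(incomes)


def heap(income, rng, bases=(1, 100, 500, 1000), weights=(0.2, 0.4, 0.2, 0.2)):
    """Round one income to the nearest multiple of a randomly chosen base.

    The bases and weights are assumed values, not estimates from the paper.
    """
    base = rng.choices(bases, weights=weights, k=1)[0]
    return base * round(income / base)


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical monthly incomes drawn from a lognormal distribution.
true_incomes = [rng.lognormvariate(7.5, 0.6) for _ in range(20_000)]
reported = [heap(x, rng) for x in true_incomes]

rate_true = poverty_rate(true_incomes)
rate_reported = poverty_rate(reported)
print(f"poverty rate, exact incomes:  {rate_true:.4f}")
print(f"poverty rate, heaped incomes: {rate_reported:.4f}")
```

Because heaped values pile up on a few points, whether a heap point falls just below or just above the poverty line can move a visible share of respondents across it, which is why the two printed rates need not agree.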
Pages: 22-42
Number of pages: 21