Inference for superpopulation parameters using sample surveys

Cited by: 46
Authors
Graubard, BI
Korn, EL
Affiliations
[1] NCI, Biostat Branch, Bethesda, MD 20892 USA
[2] NCI, Clin Trials Sect, Biometr Res Branch, Bethesda, MD 20892 USA
Keywords
cluster sampling; complex survey data; design-based inference; model-based inference; random effects; stratified sampling;
DOI
10.1214/ss/1023798999
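Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code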
020208 ; 070103 ; 0714 ;
Abstract
Sample survey inference is historically concerned with finite-population parameters, that is, functions (like means and totals) of the observations for the individuals in the population. In scientific applications, however, interest usually focuses on the "superpopulation" parameters associated with a stochastic mechanism hypothesized to generate the observations in the population rather than the finite-population parameters. Two relevant findings discussed in this paper are that (1) with stratified sampling, it is not sufficient to drop finite-population correction factors from standard design-based variance formulas to obtain appropriate variance formulas for superpopulation inference, and (2) with cluster sampling, standard design-based variance formulas can dramatically underestimate superpopulation variability, even with a small sampling fraction of the final units. A literature review of inference for superpopulation parameters is given, with emphasis on why these findings have not been previously appreciated. Examples are provided for estimating superpopulation means, linear regression coefficients and logistic regression coefficients using U.S. data from the 1987 National Health Interview Survey, the third National Health and Nutrition Examination Survey and the 1986 National Hospital Discharge Survey.
Pages: 73-96
Page count: 24