Designs for the Combination of Group- and Individual-level Data

Cited by: 25
Authors
Haneuse, Sebastien [1]
Bartell, Scott [2, 3]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[2] Univ Calif Irvine, Dept Epidemiol, Irvine, CA USA
[3] Univ Calif Irvine, Program Publ Hlth, Irvine, CA USA
Keywords
DISEASE RISK; ECOLOGICAL INFERENCE; MORTALITY; EXPOSURE; AGGREGATE; BIASES; SENSITIVITY; REGRESSION; 2-PHASE; HEALTH;
DOI
10.1097/EDE.0b013e3182125cff
Chinese Library Classification (CLC) number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004; 120402;
Abstract
Background: Studies of ecologic or aggregate data suffer from a broad range of biases when scientific interest lies with individual-level associations. To overcome these biases, epidemiologists can choose from a range of designs that combine these group-level data with individual-level data. The individual-level data provide information to identify, evaluate, and control bias, whereas the group-level data are often readily accessible and provide gains in efficiency and power. Within this context, the literature on developing models, particularly multilevel models, is well-established, but little work has been published to help researchers choose among competing designs and plan additional data collection. Methods: We review recently proposed "combined" group- and individual-level designs and methods that collect and analyze data at 2 levels of aggregation. These include aggregate data designs, hierarchical related regression, two-phase designs, and hybrid designs for ecologic inference. Results: The various methods differ in (i) the data elements available at the group and individual levels and (ii) the statistical techniques used to combine the 2 data sources. Implementing these techniques requires care, and it may often be simpler to ignore the group-level data once the individual-level data are collected. A simulation study, based on birth-weight data from North Carolina, is used to illustrate the benefit of incorporating group-level information. Conclusions: Our focus is on settings where there are individual-level data to supplement readily accessible group-level data. In this context, no single design is ideal. Choosing which design to adopt depends primarily on the model of interest and the nature of the available group-level data.
Pages: 382-389
Page count: 8