Estimating the accuracy of geographical imputation

Cited by: 62
Authors
Henry, Kevin A. [1]
Boscoe, Francis P. [2]
Affiliations
[1] New Jersey State Canc Registry, Canc Epidemiol Serv, New Jersey Dept Hlth & Senior Serv, Trenton, NJ USA
[2] New York State Dept Hlth, New York State Canc Registry, Albany, NY USA
Keywords
DOI
10.1186/1476-072X-7-3
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Background: To reduce the number of non-geocoded cases, researchers and organizations sometimes include cases geocoded to postal code centroids along with cases geocoded with the greater precision of a full street address. Some analysts then use the postal code to assign information to the cases from finer-level geographies such as a census tract. Assignment is commonly completed using either a postal code centroid or a geographical imputation method, which assigns a location by using both the demographic characteristics of the case and the population characteristics of the postal delivery area. To date, no systematic evaluation of geographical imputation methods ("geo-imputation") has been completed. The objective of this study was to determine the accuracy of census tract assignment using geo-imputation.
Methods: Using a large dataset of breast, prostate and colorectal cancer cases reported to the New Jersey Cancer Registry, we determined how often cases were assigned to the correct census tract using alternate strategies of demographic-based geo-imputation and using assignments obtained from postal code centroids. Assignment accuracy was measured by comparing the tract assigned with the tract originally identified from the full street address.
Results: Assigning cases to census tracts using the race/ethnicity population distribution within a postal code resulted in more correctly assigned cases than using postal code centroids. The addition of age characteristics increased the match rates even further. Match rates were highly dependent on both the geographic distribution of race/ethnicity groups and population density.
Conclusion: Geo-imputation appears to offer some advantages and no serious drawbacks compared with the alternative of assigning cases to census tracts based on postal code centroids. For a specific analysis, researchers will still need to consider the potential impact of geocoding quality on their results and evaluate the possibility that it might introduce geographical bias.
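The abstract describes geo-imputation only in general terms, so the following is a minimal sketch of one way the idea could work, not the authors' implementation. It assumes a case is assigned to a tract at random, with probability proportional to the number of residents of the case's race/ethnicity group in each tract of its postal code, and that accuracy is the share of cases whose imputed tract equals the tract known from the street address. The population table, postal code, tract names, and function names are all hypothetical.

```python
import random

# Hypothetical counts: postal code -> tract -> race/ethnicity group -> residents.
POPULATION = {
    "08601": {
        "tract_A": {"white": 900, "black": 100, "asian": 50},
        "tract_B": {"white": 100, "black": 900},
    },
}


def impute_tract(postal_code, group, rng):
    """Draw a tract with probability proportional to the group's population in it."""
    tracts = POPULATION[postal_code]
    names = sorted(tracts)
    weights = [tracts[name].get(group, 0) for name in names]
    if sum(weights) == 0:
        # Assumed fallback: use total tract population when the group is absent.
        weights = [sum(tracts[name].values()) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def match_rate(cases, rng):
    """Share of cases whose imputed tract equals the street-address tract."""
    hits = sum(
        impute_tract(case["postal_code"], case["group"], rng) == case["true_tract"]
        for case in cases
    )
    return hits / len(cases)


# Usage: 1,000 hypothetical cases that truly live in tract_B.
cases = [
    {"postal_code": "08601", "group": "black", "true_tract": "tract_B"}
    for _ in range(1000)
]
rate = match_rate(cases, random.Random(0))
```

In this toy setup the match rate is high because the two groups are strongly segregated between tracts, which is consistent with the abstract's remark that match rates depend on the geographic distribution of race/ethnicity groups.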
Pages: 10
Related papers
41 in total
  • [1] [Anonymous], INT J HLTH GEOGR
  • [2] Ascential Software, 2006, QUALITYSTAGE GEOLOCA
  • [3] Travel distance to radiation therapy and receipt of radiotherapy following breast-conserving surgery
    Athas, WF
    Adams-Cameron, M
    Hung, WC
    Amir-Fazli, A
    Key, CR
    [J]. JOURNAL OF THE NATIONAL CANCER INSTITUTE, 2000, 92 (03) : 269 - 271
  • [4] BOSCOE FP, 2007, GEOCODING HLTH DATA, P95
  • [5] Accuracy of city postal code coordinates as a proxy for location of residence
    Bow C.J.D.
    Waters N.M.
    Faris P.D.
    Seidel J.E.
    Galbraith P.D.
    Knudtson M.L.
    Ghali W.A.
    [J]. International Journal of Health Geographics, 3 (1)
  • [6] Cai Q., 2006, Trans. GIS, V10, P577, DOI 10.1111/j.1467-9671.2006.01013.x
  • [7] Geocoding public health data
    Carretta, HY
    Mick, SS
    [J]. AMERICAN JOURNAL OF PUBLIC HEALTH, 2003, 93 (05) : 699 - 699
  • [8] Positional error in automated geocoding of residential addresses
    Michael R Cayo
    Thomas O Talbot
    [J]. International Journal of Health Geographics, 2 (1)
  • [9] Spatial analysis of colorectal cancer incidence and proportion of late-stage in Massachusetts residents: 1995-1998
    DeChello, Laurie M.
    Sheehan, T. Joseph
    [J]. INTERNATIONAL JOURNAL OF HEALTH GEOGRAPHICS, 2007, 6 (1)
  • [10] Review: A gentle introduction to imputation of missing values
    Donders, A. Rogier T.
    van der Heijden, Geert J. M. G.
    Stijnen, Theo
    Moons, Karel G. M.
    [J]. JOURNAL OF CLINICAL EPIDEMIOLOGY, 2006, 59 (10) : 1087 - 1091