Mining the Influence of Industrial Pollution on Cancer: An Improved Spatial Co-location Pattern

Cited by: 0
Authors
Zhang L. [1]
Wang L. [1, 2]
Yang P. [1]
Affiliations
[1] School of Information Science and Engineering, Yunnan University, Kunming
[2] Dianchi College of Yunnan University, Kunming
Keywords
cancer; distance attenuation; influence factor; kernel density estimation; pollution sources; spatial data mining; spatial co-location pattern; spatial ordered-pair pattern
DOI
10.12082/dqxxkx.2023.230148
Abstract
About 60% of all known causes of cancer are related to environmental pollution. Identifying spatial co-location patterns, that is, sets of spatial features whose instances are frequently neighbors in geographic space, is important for exploring the potential relationship between industrial outdoor air pollutants and cancer risk. Traditional spatial co-location pattern mining algorithms usually measure pattern interest by computing prevalence from the frequency of cancer instances. However, the influence of a pollution source on cancer instances also depends on their proximity. In addition, the effect of a pollution source is shaped by factors such as meteorological conditions, concentration levels, and the degree of harm. Pattern interest therefore cannot be measured by the number of instance occurrences alone. To address this issue, a new spatial co-location pattern (called the spatial ordered-pair pattern) is defined, and a novel mining algorithm is proposed based on the Gaussian kernel density estimation model. The Gaussian kernel function captures well how the influence of pollution sources on cancer cases decays with distance. To better represent the real-world diffusion of pollution, a spatial neighbor relationship between pollution sources and cancer is defined that takes urban wind direction, wind speed, and pollution emission concentration into account. Furthermore, pollution sources are categorized into carcinogenic groups, and a weighted differentiation method distinguishes pollutants by carcinogenic category: the influence of each pollutant on cancer is calculated by weighting its contribution by a "carcinogenic coefficient." On this basis, a novel metric of the influence of pollution sources on cancer is presented, along with a corresponding mining algorithm. The metric not only measures the effect of the distance between pollution sources and cancer instances on pattern prevalence but also models the mechanism by which pollution sources influence cancer under real-world conditions, overcoming the limitations of traditional methods. This study also improves the robustness of the method by using a smoothing factor to mitigate mining anomalies caused by uneven distributions of cancer instances. Finally, the effectiveness and efficiency of the proposed metric and mining algorithm are tested through experiments on real and synthetic datasets, and insights are provided for cancer prevention and urban planning in Yunnan Province. The experimental results indicate that the influence degree and the participation index together accurately reflect pattern interest from both macroscopic and microscopic perspectives. Furthermore, mining efficiency increases by an average of 60% compared to other algorithms. The proposed influence degree measurement captures spatial co-location patterns more effectively and better reflects the impact of pollution sources on the incidence of cancer. © 2023 Journal of Geo-Information Science. All rights reserved.
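The distance-decay idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the unnormalized kernel form, the `bandwidth` parameter, and the scalar `coefficient` standing in for the "carcinogenic coefficient" are all assumptions made for illustration, and the wind direction, wind speed, emission concentration, and smoothing factor described above are left out.

```python
import math


def gaussian_kernel(distance, bandwidth):
    """Unnormalized Gaussian kernel: 1.0 at distance 0, decaying as distance grows.

    `bandwidth` is a hypothetical parameter controlling how fast influence fades.
    """
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def influence(source, cases, bandwidth, coefficient=1.0):
    """Kernel-weighted influence of one pollution source on a list of cancer cases.

    `source` and each case are (x, y) points in the same planar units.
    `coefficient` is an assumed stand-in for a per-pollutant carcinogenic weight.
    """
    sx, sy = source
    total = 0.0
    for cx, cy in cases:
        d = math.hypot(cx - sx, cy - sy)  # Euclidean distance to the case
        total += coefficient * gaussian_kernel(d, bandwidth)
    return total


def count_based(cases):
    """Frequency-only measure: every case counts equally, whatever its distance."""
    return len(cases)
```

Under these assumptions, two sets of cases of the same size get the same count-based score, while the kernel-weighted score is higher for the set that lies closer to the source, which is the contrast with frequency-only prevalence that the abstract draws.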
Pages: 2340-2360
Page count: 20