Quality-Guaranteed and Cost-Effective Population Health Profiling: A Deep Active Learning Approach

Cited by: 1
Authors
Chen L. [1]
Wang J. [1]
Thakuriah P. [2]
Affiliations
[1] Center for Intelligent Healthcare, Coventry University, P.O. Box 412, West Midlands, Coventry
[2] Rutgers Urban and Civic Informatics Lab, Rutgers University, Civic Square Building, Rutgers, New Brunswick, NJ
Source
ACM Transactions on Computing for Healthcare | 2023, Vol. 4, No. 4
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
convolutional neural networks (CNN); generative adversarial network; profiling of prevalence; spatio-temporal correlations
DOI
10.1145/3617179
Abstract
Reliability and cost are two primary considerations when profiling the population-scale prevalence (PPP) of multiple non-communicable diseases (NCDs). In this paper, we exploit intra-disease and inter-disease correlations across different traditionally-sensed areas (TS-A) to reduce the number of profiling tasks required without compromising data reliability. Specifically, we propose a novel approach called Compressive Population Health TS-A Selection (CPH-TS), which combines state-of-the-art profile inference, data augmentation, and active learning in a unified deep learning framework. In each profiling cycle, it actively selects the minimum number of TS-A regions for profiling task allocation and infers the missing data for the unprofiled regions with a probabilistic guarantee of reliability. We evaluate our approach on real-world prevalence datasets from London, and the results demonstrate its effectiveness: CPH-TS assigned 11.1-27.3% fewer tasks than the baselines, allocating tasks to only 34.7% of the sub-regions while keeping the profiling error below 5% in 95% of the cycles. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
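The selection loop described in the abstract (profile a few regions, infer the rest, stop once a reliability target is met) can be illustrated with a minimal sketch. This is not the authors' CPH-TS implementation: the mean-based `infer` function stands in for the deep inference model, the leave-one-out bootstrap stands in for the paper's quality assessment, regions are picked in random order rather than by active learning, and all names (`infer`, `loo_errors`, `prob_error_below`, `select_regions`) and the use of absolute error on prevalence values are assumptions made for illustration. The defaults `eps=0.05` and `p=0.95` mirror the "error below 5% for 95% of the cycles" target quoted above.

```python
import random
from statistics import mean


def infer(sensed, region):
    """Hypothetical stand-in for the inference model: predict the mean of sensed values."""
    return mean(sensed.values())


def loo_errors(sensed):
    """Leave-one-out absolute errors: hide each sensed region and re-infer it."""
    errs = []
    for region, value in sensed.items():
        rest = {k: v for k, v in sensed.items() if k != region}
        errs.append(abs(infer(rest, region) - value))
    return errs


def prob_error_below(errs, eps, n_boot=500, rng=None):
    """Bootstrap estimate of P(mean absolute error <= eps)."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(n_boot):
        sample = [rng.choice(errs) for _ in errs]
        if mean(sample) <= eps:
            hits += 1
    return hits / n_boot


def select_regions(truth, eps=0.05, p=0.95, min_sensed=3, seed=0):
    """Profile regions one at a time until the estimated reliability reaches p.

    `truth` acts as a simulation oracle: reading truth[region] plays the role
    of allocating a profiling task to that region.
    """
    rng = random.Random(seed)
    order = list(truth)
    rng.shuffle(order)  # assumption: random order instead of an active-learning policy
    sensed = {}
    for region in order:
        sensed[region] = truth[region]
        if len(sensed) >= min_sensed:
            if prob_error_below(loo_errors(sensed), eps, rng=rng) >= p:
                break
    return sensed
```

With nearly uniform prevalence the loop stops after the minimum number of regions, whereas widely varying prevalence forces it to profile every region; a stronger inference model is what would let the stopping rule trigger earlier on realistic data.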