Using internet search data to predict new HIV diagnoses in China: a modelling study

Cited by: 17
Authors
Zhang, Qingpeng [1, 2]
Chai, Yi [1, 3]
Li, Xiaoming [4 ]
Young, Sean D. [5 ]
Zhou, Jiaqi [1 ]
Affiliations
[1] City Univ Hong Kong, Dept Syst Engn & Engn Management, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Shenzhen Res Inst, Shenzhen, Peoples R China
[3] Univ Hong Kong, Dept Social Work & Social Adm, Hong Kong, Peoples R China
[4] Univ South Carolina, Arnold Sch Publ Hlth, Columbia, SC 29208 USA
[5] Univ Calif Los Angeles, Dept Family Med Univ, Univ Calif Inst Predict Technol, Los Angeles, CA USA
Funding
National Natural Science Foundation of China; US National Institutes of Health;
Keywords
HEALTH INFORMATION; TRANSMITTED-DISEASE; SEX; MEN; ONLINE; RATES; TECHNOLOGIES; HIV/AIDS; EPIDEMIC; ACCESS;
DOI
10.1136/bmjopen-2017-018335
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Objectives: Internet data are important sources of abundant information regarding HIV epidemics and risk factors. A number of case studies found an association between internet searches and outbreaks of infectious diseases, including HIV. In this research, we examined the feasibility of using search query data to predict the number of new HIV diagnoses in China.

Design: We identified a set of search queries that are associated with new HIV diagnoses in China. We developed statistical models (negative binomial generalised linear model and its Bayesian variants) to estimate the number of new HIV diagnoses by using data of search queries (Baidu) and official statistics (for the entire country and for Guangdong province) for 7 years (2010 to 2016).

Results: Search query data were positively associated with the number of new HIV diagnoses in China and in Guangdong province. Experiments demonstrated that incorporating search query data could improve the prediction performance in nowcasting and forecasting tasks.

Conclusions: Baidu data can be used to predict the number of new HIV diagnoses in China up to the province level. This study demonstrates the feasibility of using search query data to predict new HIV diagnoses. Results could potentially facilitate timely evidence-based decision making and complement conventional programmes for HIV prevention.
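
The abstract names a negative binomial generalised linear model but gives no specification, so the following is only a minimal sketch of that model class, not the authors' implementation. It assumes a single predictor (a hypothetical scaled search index), a log link, a fixed dispersion parameter `alpha`, and synthetic counts generated inside the script; none of these come from the study. The function name `fit_negbin_glm` and all data values are illustrative.

```python
import math


def fit_negbin_glm(x, y, alpha=0.1, n_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by IRLS for an NB2 model.

    The dispersion alpha is held fixed (an assumption of this sketch);
    the variance function is mu + alpha * mu**2.
    """
    # Start from the observed counts, shifted to avoid log(0).
    eta = [math.log(yi + 0.5) for yi in y]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # IRLS weight under a log link: mu^2 / Var(mu) = mu / (1 + alpha * mu).
        w = [m / (1.0 + alpha * m) for m in mu]
        # Working response for the log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted least squares for two coefficients (2x2 normal equations).
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        eta = [b0 + b1 * xi for xi in x]
        if converged:
            break
    return b0, b1


# Synthetic data only: a hypothetical scaled search index and counts
# generated from log(mu) = 5.0 + 0.8 * index, rounded to integers.
search_index = [i / 10 for i in range(21)]
counts = [round(math.exp(5.0 + 0.8 * s)) for s in search_index]

b0, b1 = fit_negbin_glm(search_index, counts)

# "Nowcast" the expected count for a newly observed search index value.
nowcast = math.exp(b0 + b1 * 1.5)
print(f"intercept={b0:.3f} slope={b1:.3f} nowcast={nowcast:.1f}")
```

A positive fitted slope corresponds to the positive association between search volume and diagnoses that the Results describe. In practice the dispersion would be estimated rather than fixed, and the Bayesian variants mentioned in the Design would place priors on the coefficients; both are outside the scope of this sketch.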
Pages: 9