ANN Multiscale Model of Anti-HIV Drugs Activity vs AIDS Prevalence in the US at County Level Based on Information Indices of Molecular Graphs and Social Networks

Cited by: 56
Authors
Gonzalez-Diaz, Humberto [1 ,2 ]
Maria Herrera-Ibata, Diana [3 ]
Duardo-Sanchez, Aliuska [3 ]
Munteanu, Cristian R. [3 ]
Alfredo Orbegozo-Medina, Ricardo [4 ]
Pazos, Alejandro [3 ]
Affiliations
[1] Univ Basque Country UPV EHU, Dept Organ Chem 2, Fac Sci & Technol, Leioa 48940, Vizcaya, Spain
[2] Basque Fdn Sci, IKERBASQUE, Bilbao 48011, Vizcaya, Spain
[3] Univ A Coruna UDC, Dept Informat & Commun Technol, La Coruna 15071, Spain
[4] USC, Dept Microbiol & Parasitol, Santiago De Compostela 15782, A Coruna, Spain
Keywords
SHANNON ENTROPY ANALYSIS; ORGANIC-MOLECULES; NEURAL-NETWORKS; TOPOLOGICAL INDEXES; CHEMICAL GRAPHS; BIG DATA; QSAR; CHEMOINFORMATICS; DISCOVERY; OPTIMIZATION;
DOI
10.1021/ci400716y
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
This work describes the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and reports the first predictive model developed with it. The new model is able to predict complex networks of AIDS prevalence in US counties, taking into consideration the social determinants and the activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using information indices of social networks and molecular graphs as input. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Finally, we used Box-Jenkins moving average operators to quantify the deviations of drugs with respect to reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in both the training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train/validate the model and predict the complex network, we needed to analyze 43,249 data points, including values of AIDS prevalence in 2,310 US counties vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.
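The abstract does not give the form of the Box-Jenkins moving average operators. A minimal sketch, assuming the common reading in which each drug's index is expressed as its deviation from the mean of the reference subset it belongs to (here, all records sharing a target), might look as follows. The function name, record fields, and toy values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def moving_average_deviations(records, value_key, group_key):
    """Return, for each record, its value minus the mean of its reference subset.

    Assumption: the operator is a plain deviation from the subset mean;
    the abstract does not state the exact definition.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        sums[record[group_key]] += record[value_key]
        counts[record[group_key]] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    return [record[value_key] - means[record[group_key]] for record in records]


# Hypothetical toy data: an information index per drug, grouped by target.
records = [
    {"drug": "A", "target": "T1", "index": 2.0},
    {"drug": "B", "target": "T1", "index": 4.0},
    {"drug": "C", "target": "T2", "index": 5.0},
]
deviations = moving_average_deviations(records, "index", "target")
print(deviations)
```

The same helper could be called once per reference subset type (target, organism, experimental parameter, protocol) by changing `group_key`, giving one deviation input per subset type.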
Pages: 744-755
Page count: 12