Novel naive Bayes classification models for predicting the chemical Ames mutagenicity

Cited by: 44
Authors
Zhang, Hui [1 ,2 ,3 ]
Kang, Yan-Li [1 ]
Zhu, Yuan-Yuan [1 ]
Zhao, Kai-Xia [1 ]
Liang, Jun-Yu [1 ]
Ding, Lan [1 ]
Zhang, Teng-Guo [1 ]
Zhang, Ji [1 ,4 ]
Affiliations
[1] Northwest Normal Univ, Coll Life Sci, Lanzhou 730070, Gansu, Peoples R China
[2] Sichuan Univ, West China Hosp, West China Med Sch, State Key Lab Biotherapy, Chengdu 610041, Sichuan, Peoples R China
[3] Sichuan Univ, West China Hosp, West China Med Sch, Canc Ctr, Chengdu 610041, Sichuan, Peoples R China
[4] Northwest Normal Univ, Bioact Prod Engn Res Ctr Gansu Distinct Plants, Lanzhou 730070, Gansu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Mutagenicity; Naive Bayes classifier; Recursive partitioning classifier; Molecular descriptors; Extended connectivity fingerprints (ECFP_14); IN-SILICO PREDICTION; STRUCTURAL ALERTS; EXPERT KNOWLEDGE; VALIDATION; TOXICOLOGY; IMPURITIES; DERIVATION; SOFTWARE; SYSTEMS; ASSAY;
DOI
10.1016/j.tiv.2017.02.016
Chinese Library Classification number
R99 [Toxicology]
Discipline classification code
100405
Abstract
Assessing drug candidates for mutagenicity is a regulatory requirement, since mutagenic compounds could pose a toxic risk to humans. The aim of this investigation was to develop a novel prediction model of mutagenicity using a naive Bayes classifier. The established model was validated by internal 5-fold cross validation and external test sets. For comparison, a recursive partitioning classifier prediction model was also established, and various other reported prediction models of mutagenicity were collected. Among these methods, the naive Bayes classifier established here performed well and stably: the average overall prediction accuracies for the internal 5-fold cross validation of the training set and for external test set I were 89.1 +/- 0.4% and 77.3 +/- 1.5%, respectively. The concordance for external test set II, comprising 446 marketed drugs, was 90.9 +/- 0.3%. In addition, four simple molecular descriptors (Apol, No. of H donors, Num-Rings and Wiener) related to mutagenicity and five representative substructures of mutagens (aromatic nitro, hydroxyl amine, nitroso, aromatic amine and N-methyl-N-methylenemethanaminum) produced by ECFP_14 fingerprints were identified. We hope the established naive Bayes prediction model can be applied to risk assessment processes, and that the information obtained on mutagenic chemicals can guide the design of chemical libraries for hit and lead optimization. (C) 2017 Elsevier Ltd. All rights reserved.
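The abstract does not give implementation details, so the following is only a generic illustration of the kind of workflow it describes: a Bernoulli naive Bayes classifier over binary substructure bits, scored by k-fold cross validation. It is not the authors' model or software. The function names, the Laplace smoothing constant `alpha`, and the toy three-bit "fingerprints" are all assumptions made for this sketch; real ECFP fingerprints would have far more bits.

```python
import math
import random


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (alpha is an assumption)."""
    n_features = len(X[0])
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        # P(bit j = 1 | class), smoothed so no probability is exactly 0 or 1
        p_on = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                for j in range(n_features)]
        model[label] = (math.log(len(rows) / len(y)), p_on)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for bit vector x."""
    def log_posterior(label):
        log_prior, p_on = model[label]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(x, p_on))
    return max(model, key=log_posterior)


def cross_validate(X, y, k=5, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = train_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return accuracies


# Hypothetical toy data: bit 0 stands in for a mutagenicity-related substructure;
# bits 1 and 2 are distributed identically in both classes.
X = [[1, i % 2, (i // 2) % 2] for i in range(10)] + \
    [[0, i % 2, (i // 2) % 2] for i in range(10)]
y = [1] * 10 + [0] * 10  # 1 = mutagen, 0 = non-mutagen

model = train_bernoulli_nb(X, y)
accuracies = cross_validate(X, y, k=5)
mean_accuracy = sum(accuracies) / len(accuracies)
```

Working in log space avoids numerical underflow when many fingerprint bits are multiplied together, and the smoothing term keeps a substructure that never appears in one class from forcing a zero probability.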
Pages: 56-63
Page count: 8