Evaluation of Different Methods for Identification of Structural Alerts Using Chemical Ames Mutagenicity Data Set as a Benchmark

Cited by: 54
Authors
Yang, Hongbin [1]
Li, Jie [1]
Wu, Zengrui [1]
Li, Weihua [1]
Liu, Guixia [1]
Tang, Yun [1]
Affiliations
[1] East China Univ Sci & Technol, Sch Pharm, Shanghai Key Lab New Drug Design, Shanghai 200237, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
IN-SILICO PREDICTION; SALMONELLA MUTAGENICITY; TOXICITY; CARCINOGENICITY; DERIVATION; SUBSTRUCTURES; VALIDATION; DISCOVERY; LIBRARY; BINDING;
DOI
10.1021/acs.chemrestox.7b00083
Chinese Library Classification
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
Identification of structural alerts for toxicity is useful in drug discovery and in other fields such as environmental protection. With structural alerts, researchers can quickly identify potentially toxic compounds and learn how to modify them. Hence, it is important to determine structural alerts from a large number of compounds quickly and accurately. Many methods for identifying structural alerts have already been reported, but how to evaluate those methods remains a problem. In this paper, we evaluated four methods for mono-substructure identification using three indices (accuracy rate, coverage rate, and information gain) to compare their advantages and disadvantages. The Kazius Ames mutagenicity data set was used as the benchmark, and the four methods were MoSS (graph-based), SARpy (fragment-based), and two fingerprint-based methods: Bioalerts and the fingerprint (FP) method we previously used. The results showed that Bioalerts and FP could detect key substructures with high accuracy and coverage rates because they allowed unclosed rings and wildcard atom or bond types. However, they also produced redundant substructures, so their predictive performance was not as good as that of SARpy. SARpy was competitive in predictive performance on both the training set and the external validation set. These results might help users select appropriate methods and might guide further development of methods for identification of structural alerts.
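The three indices named in the abstract can be illustrated with a small sketch. The abstract does not give formulas, so the definitions below are assumptions: accuracy rate is taken as the fraction of alert-containing compounds that are mutagenic, coverage rate as the fraction of mutagenic compounds that contain the alert, and information gain as the reduction in label entropy after splitting on alert presence. The function names `entropy` and `alert_indices` are hypothetical and are not taken from the paper or from any of the four tools.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class count; 0.0 for an empty set."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def alert_indices(has_alert, is_mutagen):
    """Return (accuracy_rate, coverage_rate, information_gain) for one alert.

    has_alert[i]  -- truthy if compound i contains the candidate substructure
    is_mutagen[i] -- truthy if compound i is Ames-positive
    The definitions used here are assumptions, not the paper's stated formulas.
    """
    pairs = list(zip(has_alert, is_mutagen))
    n = len(pairs)
    if n == 0:
        return 0.0, 0.0, 0.0

    tp = sum(1 for a, m in pairs if a and m)          # alert present, mutagenic
    fp = sum(1 for a, m in pairs if a and not m)      # alert present, non-mutagenic
    fn = sum(1 for a, m in pairs if not a and m)      # alert absent, mutagenic
    tn = n - tp - fp - fn                             # alert absent, non-mutagenic

    accuracy_rate = tp / (tp + fp) if tp + fp else 0.0
    coverage_rate = tp / (tp + fn) if tp + fn else 0.0

    # Entropy of the labels before and after splitting on alert presence.
    h_before = entropy(tp + fn, fp + tn)
    h_after = ((tp + fp) / n) * entropy(tp, fp) + ((fn + tn) / n) * entropy(fn, tn)
    return accuracy_rate, coverage_rate, h_before - h_after


if __name__ == "__main__":
    # Toy data: 8 compounds, 3 of which contain the candidate alert.
    hits = [1, 1, 1, 0, 0, 0, 0, 0]
    labels = [1, 1, 0, 1, 0, 0, 0, 0]
    print(alert_indices(hits, labels))
```

Under these assumed definitions, an alert that matches only a few compounds can score a high accuracy rate with a low coverage rate, which is one reason to report the indices together instead of relying on a single one.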
Pages: 1355-1364
Page count: 10
Related papers
61 in total
[21]   The signature molecular descriptor. 1. Using extended valence sequences in QSAR and QSPR studies [J].
Faulon, JL ;
Visco, DP ;
Pophale, RS .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2003, 43 (03) :707-720
[22]   Automatic knowledge extraction from chemical structures: the case of mutagenicity prediction [J].
Ferrari, T. ;
Cattaneo, D. ;
Gini, G. ;
Bakhtyari, N. Golbamaki ;
Manganaro, A. ;
Benfenati, E. .
SAR AND QSAR IN ENVIRONMENTAL RESEARCH, 2013, 24 (05) :631-649
[23]   Fragment Prioritization on a Large Mutagenicity Dataset [J].
Floris, Matteo ;
Raitano, Giuseppa ;
Medda, Ricardo ;
Benfenati, Emilio .
MOLECULAR INFORMATICS, 2017, 36 (07)
[24]   Golbamaki A, 2016, METHODS MOL BIOL, V1425, P107, DOI 10.1007/978-1-4939-3609-0_6
[25]   New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds [J].
Golbamaki, Azadi ;
Benfenati, Emilio ;
Golbamaki, Nazanin ;
Manganaro, Alberto ;
Merdivan, Erinc ;
Roncaglioni, Alessandra ;
Gini, Giuseppina .
JOURNAL OF ENVIRONMENTAL SCIENCE AND HEALTH PART C-ENVIRONMENTAL CARCINOGENESIS & ECOTOXICOLOGY REVIEWS, 2016, 34 (02) :97-113
[26]   Benchmark Data Set for in Silico Prediction of Ames Mutagenicity [J].
Hansen, Katja ;
Mika, Sebastian ;
Schroeter, Timon ;
Sutter, Andreas ;
ter Laak, Antonius ;
Steger-Hartmann, Thomas ;
Heinrich, Nikolaus ;
Mueller, Klaus-Robert .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2009, 49 (09) :2077-2081
[27]   In silico prediction of cytochrome P450 2D6 and 3A4 inhibition using Gaussian kernel weighted k-nearest neighbor and extended connectivity fingerprints, including structural fragment analysis of inhibitors versus noninhibitors [J].
Jensen, Berith F. ;
Vind, Christian ;
Padkjaer, Soren B. ;
Brockhoff, Per B. ;
Refsgaard, Hanne H. F. .
JOURNAL OF MEDICINAL CHEMISTRY, 2007, 50 (03) :501-511
[28]   Derivation and validation of toxicophores for mutagenicity prediction [J].
Kazius, J ;
McGuire, R ;
Bursi, R .
JOURNAL OF MEDICINAL CHEMISTRY, 2005, 48 (01) :312-320
[29]   Substructure mining using elaborate chemical representation [J].
Kazius, J ;
Nijssen, S ;
Kok, J ;
Bäck, T ;
Ijzerman, AP .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2006, 46 (02) :597-605
[30]   Chemical substructures that enrich for biological activity [J].
Klekota, Justin ;
Roth, Frederick P. .
BIOINFORMATICS, 2008, 24 (21) :2518-2525