Analysis and use of fragment-occurrence data in similarity-based virtual screening

Cited by: 22
Authors
Arif, Shereena M. [1, 2]
Holliday, John D. [1, 2]
Willett, Peter [1, 2]
Affiliations
[1] Univ Sheffield, Krebs Inst Biomol Res, Sheffield S1 4DP, S Yorkshire, England
[2] Univ Sheffield, Dept Informat Studies, Sheffield S1 4DP, S Yorkshire, England
Keywords
Fingerprint; Fragment occurrences; Ligand-based virtual screening; Similarity searching; Substructural fragment; Tanimoto coefficient; Virtual screening; Weighting scheme; MOLECULAR SIMILARITY; DESCRIPTORS; 2D; PREDICTION; SEARCH
DOI
10.1007/s10822-009-9285-0
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Current systems for similarity-based virtual screening use similarity measures in which all the fragments in a fingerprint contribute equally to the calculation of structural similarity. This paper discusses the weighting of fragments on the basis of their frequencies of occurrence in molecules. Extensive experiments with sets of active molecules from the MDL Drug Data Report and the World of Molecular Bioactivity databases, using fingerprints encoding Tripos holograms, Pipeline Pilot ECFC_4 circular substructures and Sunset Molecular keys, demonstrate clearly that frequency-based screening is generally more effective than conventional, unweighted screening. The results suggest that standardising the raw occurrence frequencies by taking the square root of the frequencies will maximise the effectiveness of virtual screening. An upper-bound analysis shows the complex interactions that can take place between representations, weighting schemes and similarity coefficients when similarity measures are computed, and provides a rationalisation of the relative performance of the various weighting schemes.
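The effect of the weighting schemes described in the abstract can be illustrated with a minimal sketch. It assumes the widely used non-binary form of the Tanimoto coefficient, sum(x*y) / (sum(x^2) + sum(y^2) - sum(x*y)); the abstract does not state the exact formulae used in the paper, so this is an illustration rather than the authors' implementation. The fragment labels, occurrence counts, and the function names `weight` and `tanimoto` are hypothetical.

```python
import math

def weight(counts, scheme="sqrt"):
    """Turn raw fragment-occurrence counts into fragment weights.

    scheme is one of (names are assumptions made for this sketch):
      "binary" - presence/absence only (conventional unweighted fingerprint)
      "raw"    - the occurrence count itself
      "sqrt"   - square root of the occurrence count
    """
    if scheme == "binary":
        return {f: 1.0 for f, c in counts.items() if c > 0}
    if scheme == "raw":
        return {f: float(c) for f, c in counts.items() if c > 0}
    if scheme == "sqrt":
        return {f: math.sqrt(c) for f, c in counts.items() if c > 0}
    raise ValueError(f"unknown weighting scheme: {scheme!r}")

def tanimoto(x, y):
    """Non-binary Tanimoto coefficient for two sparse weight vectors (dicts)."""
    dot = sum(w * y[f] for f, w in x.items() if f in y)
    xx = sum(w * w for w in x.values())
    yy = sum(w * w for w in y.values())
    denom = xx + yy - dot
    return dot / denom if denom else 0.0

# Hypothetical fragment counts for a reference structure and a database structure.
reference = {"C-C": 4, "C=O": 1, "C-N": 1}
candidate = {"C-C": 1, "C=O": 1, "C-N": 1}

scores = {
    scheme: tanimoto(weight(reference, scheme), weight(candidate, scheme))
    for scheme in ("binary", "raw", "sqrt")
}
for scheme, score in scores.items():
    print(f"{scheme:>6}: {score:.3f}")
```

In this toy case the binary fingerprints are identical, so the unweighted score cannot distinguish the two molecules. Raw counts penalise the difference in the repeated fragment heavily, and the square-root transform gives an intermediate score.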
Pages: 655-668
Page count: 14