Identification of diverse database subsets using property-based and fragment-based molecular descriptions

Cited by: 71
Authors
Ashton, M
Barnard, J
Casset, F
Charlton, M
Downs, G
Gorse, D
Holliday, J
Lahana, R
Willett, P
Affiliations
[1] Evotec OAI, Abingdon OX14 4SD, Oxon, England
[2] Barnard Chem Informat Ltd, Sheffield S6 6BX, S Yorkshire, England
[3] Syst Parc Sci, F-30000 Nimes, France
[4] Univ Sheffield, Krebs Inst Biomol Res, Sheffield S10 2TN, S Yorkshire, England
[5] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
Source
QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS | 2002 / Vol. 21 / Issue 06
Keywords
diversity; molecular diversity analysis; structural diversity; subset selection
DOI
10.1002/qsar.200290002
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets; that the property-based descriptors are marginally superior to the fragment-based descriptors; and that both approaches are noticeably superior to random selection.
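For readers unfamiliar with MaxMin dissimilarity-based selection, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes fragment bit-strings are represented as Python sets of "on" bit positions and uses Tanimoto dissimilarity; the abstract specifies neither the dissimilarity measure nor the seeding rule, so both, along with the names `tanimoto_dissimilarity` and `maxmin_select`, are illustrative assumptions.

```python
def tanimoto_dissimilarity(a, b):
    """Return 1 - Tanimoto coefficient for two sets of 'on' bit positions."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, n_select, seed_index=0):
    """Greedy MaxMin selection (hypothetical helper).

    Starting from an assumed seed compound, repeatedly add the compound
    whose minimum dissimilarity to the already selected compounds is largest.
    Returns the selected indices in the order they were chosen.
    """
    if not 0 < n_select <= len(fingerprints):
        raise ValueError("n_select must be between 1 and len(fingerprints)")

    selected = [seed_index]
    # min_dist[i]: dissimilarity from compound i to its nearest selected compound
    min_dist = [tanimoto_dissimilarity(fp, fingerprints[seed_index])
                for fp in fingerprints]
    min_dist[seed_index] = -1.0  # mark as selected so it is never picked again

    while len(selected) < n_select:
        nxt = max(range(len(fingerprints)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, fp in enumerate(fingerprints):
            if min_dist[i] >= 0.0:  # skip compounds already selected
                d = tanimoto_dissimilarity(fp, fingerprints[nxt])
                if d < min_dist[i]:
                    min_dist[i] = d
        min_dist[nxt] = -1.0

    return selected


# Toy data (invented for illustration): four small fingerprints.
toy = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
subset = maxmin_select(toy, n_select=2)
```

With the toy data, the second pick is the fingerprint that shares no bits with the seed, which is the behaviour MaxMin is designed to produce. Each round costs one pass over the file, so selecting a 1% to 20% subset of tens of thousands of compounds remains tractable with this greedy scheme.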
Pages: 598 - 604
Number of pages: 7
Related papers
21 in total
  • [1] Strategies for subset selection of parts of an in-house chemical library
    Andersson, PM
    Sjöström, M
    Wold, S
    Lundstedt, T
    [J]. JOURNAL OF CHEMOMETRICS, 2001, 15 (04) : 353 - 369
  • [2] Chemical fragment generation and clustering software
    Barnard, JM
    Downs, GM
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1997, 37 (01) : 141 - 142
  • [3] CLUSTERING OF CHEMICAL STRUCTURES ON THE BASIS OF 2-DIMENSIONAL SIMILARITY MEASURES
    BARNARD, JM
    DOWNS, GM
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1992, 32 (06) : 644 - 649
  • [4] Molecular diversity and representativity in chemical databases
    Bayada, DM
    Hamersma, H
    van Geerestein, VJ
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1999, 39 (01) : 1 - 10
  • [6] Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection
    Brown, RD
    Martin, YC
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1996, 36 (03) : 572 - 584
  • [7] Brown RD, 1997, PERSPECT DRUG DISCOV, V7-8, P31
  • [8] DEAN PM, 1999, MOL DIVERSITY DRUG D
  • [9] GHOSE AK, 2001, COMBINATORIAL LIB DE
  • [10] Molecular diversity and its analysis
    Gorse, D
    Rees, A
    Kaczorek, M
    Lahana, R
    [J]. DRUG DISCOVERY TODAY, 1999, 4 (06) : 257 - 264