Identification of diverse database subsets using property-based and fragment-based molecular descriptions

Cited by: 76
Authors
Ashton, M
Barnard, J
Casset, F
Charlton, M
Downs, G
Gorse, D
Holliday, J
Lahana, R
Willett, P
Affiliations
[1] Evotec OAI, Abingdon OX14 4SD, Oxon, England
[2] Barnard Chem Informat Ltd, Sheffield S6 6BX, S Yorkshire, England
[3] Syst Parc Sci, F-30000 Nimes, France
[4] Univ Sheffield, Krebs Inst Biomol Res, Sheffield S10 2TN, S Yorkshire, England
[5] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
Source
QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS | 2002 / Vol. 21 / Issue 06
Keywords
diversity; molecular diversity analysis; structural diversity; subset selection
DOI
10.1002/qsar.200290002
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Subject classification code
100701
Abstract
This paper reports a comparison of calculated molecular properties and 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44,295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets, that the property-based descriptors are marginally superior to the fragment-based descriptors, and that both approaches are noticeably superior to random selection.
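The abstract names MaxMin dissimilarity-based selection but does not describe the variant used. The following is a minimal Python sketch of the general idea only, under stated assumptions: each compound is represented as a set of "on" bit positions in a 2D fragment bit-string, dissimilarity is taken as 1 minus the Tanimoto coefficient, and the first compound is chosen by a caller-supplied index. The function names `tanimoto` and `maxmin_select` and the `seed_index` parameter are illustrative and do not come from the paper.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty bit-strings are identical
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def maxmin_select(fps: List[Set[int]], n_select: int, seed_index: int = 0) -> List[int]:
    """Pick n_select indices so that each new pick is the compound whose
    nearest already-selected neighbour is as far away as possible."""
    if not fps or n_select <= 0:
        return []
    n_select = min(n_select, len(fps))

    selected = [seed_index]
    # min_dist[i]: dissimilarity from compound i to its nearest selected compound
    min_dist = [1.0 - tanimoto(fp, fps[seed_index]) for fp in fps]
    min_dist[seed_index] = -1.0  # negative value marks "already selected"

    while len(selected) < n_select:
        # Candidate with the largest minimum dissimilarity to the subset
        nxt = max(range(len(fps)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances for the remaining candidates
        for i, fp in enumerate(fps):
            if min_dist[i] >= 0.0:
                min_dist[i] = min(min_dist[i], 1.0 - tanimoto(fp, fps[nxt]))
        min_dist[nxt] = -1.0
    return selected


# Toy data (hypothetical bit positions, not taken from the paper)
toy_fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
subset = maxmin_select(toy_fps, 3)
```

Keeping a running `min_dist` array means each new pick costs one pass over the file rather than a comparison against every selected compound, which matters for files of the size described in the abstract. For property-based descriptors, a different distance (for example Euclidean distance on scaled properties) would replace `tanimoto`; that substitution is likewise an assumption, not a detail given in the abstract.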
Pages: 598-604
Number of pages: 7