A Machine Learning Approach to Weighting Schemes in the Data Fusion of Similarity Coefficients

Cited by: 12
Authors
Chen, Jenny [1]
Holliday, John [1]
Bradshaw, John [2]
Affiliations
[1] Univ Sheffield, Dept Informat Studies, Sheffield S1 4DP, S Yorkshire, England
[2] Daylight Chem Informat Syst Inc, Aliso Viejo, CA 92656 USA
Keywords
CHEMICAL SIMILARITY; MOLECULAR SIMILARITY; CHEMOINFORMATICS; DISSIMILARITY; FINGERPRINTS; COMBINATION; STRINGS; SIZE;
DOI
10.1021/ci800292d
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Applying data fusion techniques to combine the results of similarity searches of chemical databases has been shown to improve search performance. When data fusion is used to combine the results of searches based on different similarity coefficients, the optimum combination depends on the size of the molecules being compared, measured by the number of substructural fragments present. This paper describes preliminary simulation tests that aim to deduce automatically, using machine learning techniques, the optimum combination of similarity coefficients to fuse for a given class of active compounds.
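To make the idea of fusing searches run with different similarity coefficients concrete, the following is a minimal sketch and not the authors' procedure. It assumes fingerprints are represented as Python sets of "on" bit positions, uses the Tanimoto and cosine coefficients as two illustrative choices, and fuses the resulting rankings with a weighted sum of rank positions. The function names, the `weights` parameter, and the toy data are all hypothetical; the abstract does not specify which coefficients or fusion rule the paper uses.

```python
from math import sqrt


def tanimoto(a, b):
    """Tanimoto coefficient for two fingerprints given as sets of on-bits."""
    common = len(a & b)
    denom = len(a) + len(b) - common
    return common / denom if denom else 0.0


def cosine(a, b):
    """Cosine coefficient for two fingerprints given as sets of on-bits."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def rank_positions(scores):
    """Map each molecule id to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return {k: i + 1 for i, k in enumerate(ordered)}


def fuse_by_rank_sum(query, database, coefficients, weights=None):
    """Rank the database once per coefficient, then fuse by weighted rank sum.

    `weights` is a hypothetical per-coefficient weighting (assumption);
    equal weights are used when it is omitted.
    """
    weights = weights or [1.0] * len(coefficients)
    fused = {k: 0.0 for k in database}
    for coeff, w in zip(coefficients, weights):
        scores = {k: coeff(query, fp) for k, fp in database.items()}
        for k, r in rank_positions(scores).items():
            fused[k] += w * r
    # A lower fused rank sum means the molecule ranked well across searches.
    return sorted(fused, key=lambda k: (fused[k], k))


# Toy data (invented for illustration only).
query = {1, 2, 3, 4}
database = {"A": {1, 2, 3, 4, 5}, "B": {1, 2}, "C": {7, 8, 9}}
ranking = fuse_by_rank_sum(query, database, [tanimoto, cosine])
```

In this sketch, a learned weighting scheme would correspond to choosing `weights` (or the set of coefficients) per class of active compounds rather than fixing them by hand.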
Pages: 185-194
Number of pages: 10