A similarity-based data-fusion approach to the visual characterization and comparison of compound databases

Cited by: 59
Authors
Medina-Franco, Jose L.
Maggiora, Gerald M.
Giulianotti, Marc A.
Pinilla, Clemencia
Houghten, Richard A.
Affiliations
[1] Univ Arizona, Coll Pharm, BIO5 Inst, Tucson, AZ 85721 USA
[2] Torrey Pines Inst Mol Studies, San Diego, CA 92121 USA
Keywords
combinatorial libraries; compound acquisition; compound selection; data visualization; diversity analysis; fusion-based similarity; ligand-based virtual screening; multi-fusion similarity maps
DOI
10.1111/j.1747-0285.2007.00579.x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Subject classification code
071010; 081704
Abstract
A low-dimensional method, based on the use of multiple fusion-based similarity measures, is described for graphically depicting and characterizing relationships among molecules in compound databases. The measures are used to construct multi-fusion similarity maps that characterize the relationship of a set of 'test' molecules to a set of 'reference' molecules. The reference set is very general and can be made of molecules from, for example, the set of test molecules itself (the self-referencing case), from a small library or large compound collection, or from actives in a given assay or group of assays. The test set is any collection of compounds to be analyzed with respect to the specified reference set. Multiple fusion similarity measures tend to provide more information than single fusion-based measures, including information on the nature of the chemical-space neighborhoods surrounding reference-set molecules. A general discussion is presented on how to interpret multi-fusion similarity maps, and several examples are given that illustrate how these maps can be used to compare compound libraries or collections, to select compounds for screening or acquisition, and to identify new active molecules using ligand-based virtual screening.
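The abstract does not spell out which fusion rules or similarity coefficient are used, so the following is only a minimal sketch of how the coordinates of one point on such a map might be computed. It assumes two fusion rules (maximum and mean), the Tanimoto coefficient, and fingerprints represented as Python sets of "on" bits; the names `tanimoto` and `fusion_coordinates` and the toy fingerprints are hypothetical and not taken from the paper.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def fusion_coordinates(test_fp, reference_fps):
    """Fuse the similarities of one test molecule to every reference molecule.

    Returns (max-fusion, mean-fusion). Both fusion rules are assumptions
    made for illustration only.
    """
    sims = [tanimoto(test_fp, ref) for ref in reference_fps]
    return max(sims), mean(sims)


# Toy data (hypothetical): three reference molecules, three test molecules.
reference = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8, 9}]
test = {
    "mol_a": {1, 2, 3, 4},  # identical to the first reference molecule
    "mol_b": {3, 4, 10},    # partial overlap with two reference molecules
    "mol_c": {20, 21},      # no overlap with the reference set
}

# One (max, mean) pair per test molecule; plotting these pairs gives the map.
coords = {name: fusion_coordinates(fp, reference) for name, fp in test.items()}
for name, (max_sim, mean_sim) in coords.items():
    print(f"{name}: max={max_sim:.3f} mean={mean_sim:.3f}")
```

Under these assumptions, a test molecule with a high maximum but a low mean resembles one reference molecule closely while being dissimilar to the rest of the reference set, which is the kind of neighborhood information a single fused value would not show.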
Pages: 393-412
Page count: 20