A method for quantifying and visualizing the diversity of QSAR models

Cited by: 19
Authors
Izrailev, S [1]
Agrafiotis, DK [1]
Affiliations
[1] 3 Dimens Pharmaceut Inc, Cranbury, NJ 08512 USA
Keywords
stochastic proximity embedding; multi-dimensional scaling; nonlinear mapping; feature selection; point set similarity; quantitative structure-activity relationships; data mining
DOI
10.1016/j.jmgm.2003.10.001
CLC number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Feature selection is one of the most commonly used and reliable methods for deriving predictive quantitative structure-activity relationships (QSAR). Many feature selection algorithms are stochastic in nature and often produce different solutions depending on the initialization conditions. Because some features may be highly correlated, models that are based on different sets of descriptors may capture essentially the same information; however, such models are difficult to recognize. Here, we introduce a measure of similarity between QSAR models that captures the correlation between the underlying features. This measure can be used in conjunction with stochastic proximity embedding (SPE) or multi-dimensional scaling (MDS) to create a meaningful visual representation of structure-activity model space and aid in the post-processing and analysis of results of feature selection calculations. © 2003 Elsevier Inc. All rights reserved.
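The idea in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' measure, which this record does not define. As an assumption, each model is represented by the indices of its selected descriptors, and two models are scored by a symmetric best-match average of absolute Pearson correlations between their descriptors. The resulting dissimilarities are embedded with classical MDS, used here as a simple stand-in for SPE. The names `model_similarity` and `classical_mds` and the synthetic data are hypothetical.

```python
import numpy as np

def model_similarity(abs_corr, a, b):
    """Symmetric best-match similarity between two descriptor subsets.

    Assumption for illustration only: each descriptor in one model is
    matched to its most correlated descriptor in the other model, and
    the matches are averaged in both directions.
    """
    block = abs_corr[np.ix_(a, b)]
    return 0.5 * (block.max(axis=1).mean() + block.max(axis=0).mean())

def classical_mds(dist, dim=2):
    """Classical (Torgerson) MDS via double centering of squared distances."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues that arise from non-Euclidean input.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

# Synthetic descriptors: columns 3-5 are noisy copies of columns 0-2.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
data = np.hstack([base, base + 0.05 * rng.normal(size=(200, 3))])
abs_corr = np.abs(np.corrcoef(data, rowvar=False))

# Four hypothetical models, each a list of selected descriptor indices.
models = [[0, 1], [3, 4], [2, 5], [0, 2]]
sim = np.array([[model_similarity(abs_corr, a, b) for b in models]
                for a in models])
dist = 1.0 - sim
coords = classical_mds(dist)  # one 2-D point per model
```

In this toy setup the models `[0, 1]` and `[3, 4]` share no descriptor yet score as nearly identical, which is the situation the abstract describes: different descriptor sets that carry essentially the same information.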
Pages: 275-284
Page count: 10