Design and evaluation of a molecular fingerprint involving the transformation of property descriptor values into a binary classification scheme

Cited by: 73
Authors
Xue, L
Godden, JW
Stahura, FL
Bajorath, J
Affiliations
[1] Albany Mol Res Inc, Dept Comp Aided Drug Discovery, Bothell Res Ctr, Bothell, WA 98011 USA
[2] Univ Washington, Dept Biol Struct, Seattle, WA 98195 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2003 / Vol. 43 / Issue 04
Keywords
DOI
10.1021/ci030285+
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
A new fingerprint design concept is introduced that transforms molecular property descriptors into two-state descriptors and thus permits binary encoding. This transformation is based on the calculation of statistical medians of descriptor distributions in large compound collections and alleviates the need for value range encoding of these descriptors. For binary encoded property descriptors, bit positions that are set off capture as much information as bit positions that are set on, different from conventional fingerprint representations. Accordingly, a variant of the Tanimoto coefficient has been defined for comparison of these fingerprints. Following our design idea, a prototypic fingerprint termed MP-MFP was implemented by combining 61 binary encoded property descriptors with 110 structural fragment-type descriptors. The performance of this fingerprint was evaluated in systematic similarity search calculations in a database containing 549 molecules belonging to 38 different activity classes and 5000 background molecules. In these calculations, MP-MFP correctly recognized ~34% of all similarity relationships, with only 0.04% false positives, and performed better than previous designs and MACCS keys. The results suggest that combinations of simplified two-state property descriptors have predictive value in the analysis of molecular similarity.
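The median-based encoding described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names (`fit_medians`, `encode`, `tanimoto`, `averaged_tanimoto`), the toy descriptor values, the "strictly above the median sets the bit" rule, and in particular the similarity measure. The abstract states only that a Tanimoto variant was defined so that off bits count as much as on bits; the averaged form below is one plausible way to do that, not necessarily the authors' formula.

```python
from statistics import median


def fit_medians(reference_rows):
    """Per-descriptor medians over a reference compound collection.

    reference_rows: list of rows, one row of descriptor values per compound.
    """
    return [median(column) for column in zip(*reference_rows)]


def encode(descriptor_values, medians):
    """Two-state encoding: bit is 1 if the value is strictly above the median.

    The strict '>' is an assumption; the abstract does not say how ties are handled.
    """
    return [1 if v > m else 0 for v, m in zip(descriptor_values, medians)]


def tanimoto(a, b, state=1):
    """Tanimoto coefficient counting only positions in the given state (1 or 0)."""
    in_a = sum(1 for x in a if x == state)
    in_b = sum(1 for x in b if x == state)
    both = sum(1 for x, y in zip(a, b) if x == state and y == state)
    denom = in_a + in_b - both
    # Convention chosen here: if neither fingerprint has any bit in this state,
    # treat the pair as identical with respect to that state.
    return both / denom if denom else 1.0


def averaged_tanimoto(a, b):
    """Hypothetical variant: mean of the on-bit and off-bit Tanimoto values."""
    return 0.5 * (tanimoto(a, b, state=1) + tanimoto(a, b, state=0))


# Toy reference collection: 5 compounds x 3 property descriptors (made-up values).
reference = [
    [1.0, 10.0, 0.5],
    [2.0, 20.0, 0.1],
    [3.0, 30.0, 0.9],
    [4.0, 40.0, 0.3],
    [5.0, 50.0, 0.7],
]
medians = fit_medians(reference)

fp_a = encode([4.0, 10.0, 0.5], medians)
fp_b = encode([5.0, 40.0, 0.1], medians)
similarity = averaged_tanimoto(fp_a, fp_b)
```

Because each descriptor is split at its median, roughly half of a large collection sets each bit, which is why an off bit is about as informative as an on bit and a measure that ignores off bits would discard information.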
Pages: 1151 / 1157
Number of pages: 7
Related papers
38 in total
  • [1] [Anonymous], MACCS KEYS
  • [2] Selected concepts and investigations in compound classification, molecular descriptor analysis, and virtual screening
    Bajorath, J
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2001, 41 (02): 233 - 245
  • [3] Topological indices: Their nature and mutual relatedness
    Basak, SC
    Balaban, AT
    Grunwald, GD
    Gute, BD
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (04): 891 - 898
  • [4] A rapid computational method for lead evolution: Description and application to α1-adrenergic antagonists
    Bradley, EK
    Beroza, P
    Penzotti, JE
    Grootenhuis, PDJ
    Spellmeyer, DC
    Miller, JL
    [J]. JOURNAL OF MEDICINAL CHEMISTRY, 2000, 43 (14): 2770 - 2774
  • [5] The information content of 2D and 3D structural descriptors relevant to ligand-receptor binding
    Brown, RD
    Martin, YC
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1997, 37 (01): 1 - 9
  • [6] *CHEM COMP GROUP I, MOL OPE ENV VERS 200
  • [7] On the properties of bit string-based measures of chemical similarity
    Flower, DR
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1998, 38 (03): 379 - 386
  • [8] ITERATIVE PARTIAL EQUALIZATION OF ORBITAL ELECTRONEGATIVITY - A RAPID ACCESS TO ATOMIC CHARGES
    GASTEIGER, J
    MARSILI, M
    [J]. TETRAHEDRON, 1980, 36 (22): 3219 - 3228
  • [9] Combinatorial preferences affect molecular similarity/diversity calculations using binary fingerprints and Tanimoto coefficients
    Godden, JW
    Xue, L
    Bajorath, J
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (01): 163 - 166
  • [10] Variability of molecular descriptors in compound databases revealed by Shannon entropy calculations
    Godden, JW
    Stahura, FL
    Bajorath, J
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (03): 796 - 800