Analysis of high-throughput screening assays using cluster enrichment

Cited by: 12
Authors
Pu, Minya [1]
Hayashi, Tomoko [1]
Cottam, Howard [1]
Mulvaney, Joseph [2]
Arkin, Michelle [2]
Corr, Maripat [1]
Carson, Dennis [1]
Messer, Karen [1]
Affiliations
[1] Univ Calif San Diego, Moores Canc Ctr, La Jolla, CA 92093 USA
[2] Univ Calif San Francisco, Small Mol Discovery Ctr, San Francisco, CA 94158 USA
Keywords
high-throughput screening; hit selection; cluster analysis; Murcko fragments; fingerprint descriptors; top X; HTS hit selection; support vector machines; false discovery rate; scale RNAi screens
DOI
10.1002/sim.5455
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
In this paper, we describe the implementation and evaluation of a cluster-based enrichment strategy to call hits from a high-throughput screen using a typical cell-based assay of 160,000 chemical compounds. Our focus is on statistical properties of the prospective design choices throughout the analysis, including how to choose the number of clusters for optimal power, the choice of test statistic, the significance thresholds for clusters and the activity threshold for candidate hits, how to rank selected hits for carry-forward to the confirmation screen, and how to identify confirmed hits in a data-driven manner. Whereas previously the literature has focused on choice of test statistic or chemical descriptors, our studies suggest that cluster size is the more important design choice. We recommend clusters to be ranked by enrichment odds ratio, not by p-value. Our conceptually simple test statistic is seen to identify the same set of hits as more complex scoring methods proposed in the literature do. We prospectively confirm that such a cluster-based approach can outperform the naive top X approach and estimate that we improved confirmation rates by about 31.5% from 813 using the top X approach to 1187 using our cluster-based method. Copyright © 2012 John Wiley & Sons, Ltd.
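The abstract's recommendation to rank enriched clusters by odds ratio, not by p-value, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify their test statistic, so a one-sided hypergeometric (Fisher-type) test is assumed here. The function names, the 0.5 continuity correction, the unadjusted `alpha` cutoff, and the toy cluster counts are all hypothetical.

```python
from math import comb


def enrichment(cluster_active, cluster_size, total_active, total_size):
    """Return (odds_ratio, p_value) for one cluster versus the rest of the library."""
    a = cluster_active                   # active, inside cluster
    b = cluster_size - cluster_active    # inactive, inside cluster
    c = total_active - cluster_active    # active, outside cluster
    d = total_size - cluster_size - c    # inactive, outside cluster
    # Assumption: add 0.5 to each cell so the odds ratio stays finite
    # when a cell count is zero.
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    # One-sided hypergeometric tail: probability of seeing at least `a`
    # actives in a random draw of `cluster_size` compounds.
    upper = min(cluster_size, total_active)
    tail = sum(
        comb(total_active, k) * comb(total_size - total_active, cluster_size - k)
        for k in range(a, upper + 1)
    )
    p_value = tail / comb(total_size, cluster_size)
    return odds_ratio, p_value


def rank_clusters(clusters, total_active, total_size, alpha=0.05):
    """Keep clusters with p <= alpha, then sort them by odds ratio (descending)."""
    scored = []
    for name, (n_active, size) in clusters.items():
        odds_ratio, p_value = enrichment(n_active, size, total_active, total_size)
        scored.append((name, odds_ratio, p_value))
    significant = [s for s in scored if s[2] <= alpha]
    return sorted(significant, key=lambda s: s[1], reverse=True)


# Hypothetical toy library: 1000 compounds, 50 of them above the activity threshold.
# Each cluster maps to (number of actives, cluster size).
toy_clusters = {"A": (8, 10), "B": (1, 10), "C": (5, 20)}
ranked = rank_clusters(toy_clusters, total_active=50, total_size=1000)
for name, odds_ratio, p_value in ranked:
    print(f"cluster {name}: odds ratio {odds_ratio:.1f}, p = {p_value:.2e}")
```

In a real screen the cluster-level p-values would normally be adjusted for multiple testing before thresholding; the plain `alpha` cutoff here is only a placeholder for that step.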
Pages: 4175-4189
Number of pages: 15