Efficient Substructure Searching of Large Chemical Libraries: The ABCD Chemical Cartridge

Cited by: 19
Authors
Agrafiotis, Dimitris K. [1]
Lobanov, Victor S. [1]
Shemanarev, Maxim [1]
Rassokhin, Dmitrii N. [1]
Izrailev, Sergei [1]
Jaeger, Edward P. [1]
Alex, Simson [1]
Farnum, Michael [1]
Affiliations
[1] Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Spring House, PA 19477, USA
Keywords
ALGORITHM; FINGERPRINTS; STORAGE
DOI
10.1021/ci200413e
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Efficient substructure searching is a key requirement for any chemical information management system. In this paper, we describe the substructure search capabilities of ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. The solution consists of several algorithmic components: 1) a pattern mapping algorithm for solving the subgraph isomorphism problem, 2) an indexing scheme that enables very fast substructure searches on large structure files, 3) the incorporation of that indexing scheme into an Oracle cartridge to enable querying large relational databases through SQL, and 4) a cost estimation scheme that allows the Oracle cost-based optimizer to generate a good execution plan when a substructure search is combined with additional constraints in a single SQL query. The algorithm was tested on a public database comprising nearly 1 million molecules using 4,629 substructure queries, the vast majority of which were submitted by discovery scientists over the last 2.5 years of user acceptance testing of ABCD. 80.7% of these queries were completed in less than a second and 96.8% in less than ten seconds on a single CPU, while on eight processing cores these numbers increased to 93.2% and 99.7%, respectively. The slower queries involved extremely generic patterns that returned the entire database as screening hits and required extensive atom-by-atom verification.
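The abstract's distinction between "screening hits" and "atom-by-atom verification" follows the common two-stage pattern for substructure search, which can be illustrated with a minimal sketch. This is not the ABCD algorithm or its index: the feature set (elements and bonded element pairs), the backtracking matcher, the hydrogen-free toy molecules, and all function names below are assumptions made purely for illustration.

```python
# Toy two-stage substructure search: cheap fingerprint screen, then
# atom-by-atom verification. Illustrative only; NOT the ABCD implementation.

def make_mol(atoms, bonds):
    """Build (atoms, adjacency) from element symbols and index pairs."""
    adj = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    return atoms, adj

def fingerprint(mol):
    """Hypothetical feature set: elements plus bonded element pairs."""
    atoms, adj = mol
    feats = set(atoms)
    for i, nbrs in adj.items():
        for j in nbrs:
            feats.add(tuple(sorted((atoms[i], atoms[j]))))
    return frozenset(feats)

def is_substructure(query, target):
    """Backtracking subgraph match (bond orders ignored in this sketch)."""
    q_atoms, q_adj = query
    t_atoms, t_adj = target
    mapping = {}  # query atom index -> target atom index

    def extend(q):
        if q == len(q_atoms):
            return True
        for t in range(len(t_atoms)):
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Every already-mapped neighbor of q must map to a neighbor of t.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if extend(q + 1):
                    return True
                del mapping[q]
        return False

    return extend(0)

def search(query, library):
    """Return (hit names, number of molecules that needed verification)."""
    q_fp = fingerprint(query)
    hits, verified = [], 0
    for name, mol in library.items():
        if not q_fp <= fingerprint(mol):  # screen: query features must all be present
            continue
        verified += 1
        if is_substructure(query, mol):
            hits.append(name)
    return hits, verified

# Hydrogen-suppressed toy molecules (made up for this example).
library = {
    "ethanol": make_mol(["C", "C", "O"], [(0, 1), (1, 2)]),
    "dimethyl_ether": make_mol(["C", "O", "C"], [(0, 1), (1, 2)]),
    "propane": make_mol(["C", "C", "C"], [(0, 1), (1, 2)]),
    "acetic_acid": make_mol(["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
    "toy_amine": make_mol(["C", "C", "N", "C", "O"],
                          [(0, 1), (1, 2), (2, 3), (3, 4)]),
}
query = make_mol(["C", "C", "O"], [(0, 1), (1, 2)])  # C-C-O pattern
hits, verified = search(query, library)
print(hits, verified)
```

In this toy library the screen discards two molecules outright, while `toy_amine` passes the screen but fails verification. A very generic query would pass the screen for nearly every molecule, which is consistent with the abstract's remark that the slowest queries returned the entire database as screening hits. A production system would precompute and index the fingerprints rather than recompute them per query.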
Pages: 3113-3130
Number of pages: 18