Calculation of Molecular Lipophilicity: State-of-the-Art and Comparison of Log P Methods on More Than 96,000 Compounds

Cited by: 493
Authors
Mannhold, Raimund [2]
Poda, Gennadiy I. [3]
Ostermann, Claude [4]
Tetko, Igor V. [1, 5]
Affiliations
[1] German Res Ctr Environm Hlth GmbH, Helmholtz Zentrum Munchen, Inst Bioinformat & Syst Biol, D-85764 Neuherberg, Germany
[2] Univ Dusseldorf, Mol Drug Res Grp, D-40225 Dusseldorf, Germany
[3] Pfizer Global R&D, Chesterfield, MO 63017 USA
[4] Nycomed GmbH, D-78467 Constance, Germany
[5] Natl Acad Sci Ukraine, Inst Bioorgan & Petrochem, UA-02660 Kiev, Ukraine
Keywords
lipophilicity; log P calculation; substructure-based approaches; fragmental methods; atom-based methods; property-based approaches; methods based on 3D-structure representation; empirical approaches; quantum chemical semi-empirical calculations; continuum solvation models; molecular dynamics calculations; molecular lipophilicity potential; lattice energy calculations; topological descriptors; graph molecular connectivity; electrotopological-state (E-state) descriptors; consensus model; OCTANOL-WATER PARTITION; ATOMIC PHYSICOCHEMICAL PARAMETERS; HYDROPHOBIC FRAGMENTAL CONSTANT; SOLVATION ENERGY RELATIONSHIPS; NEURAL-NETWORK; ORGANIC-COMPOUNDS; AQUEOUS SOLUBILITY; SURFACE-AREA; PREDICTION; COEFFICIENTS;
DOI
10.1002/jps.21494
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
We first review the state of the art in the development of log P prediction approaches, which fall into two major categories: substructure-based and property-based methods. Then, we compare the predictive power of representative methods for one public dataset (N = 266) and two in-house datasets from Nycomed (N = 882) and Pfizer (N = 95809). A total of 30 and 18 methods were tested for the public and industrial datasets, respectively. Accuracy of models declined with the number of nonhydrogen atoms. The Arithmetic Average Model (AAM), which predicts the same value (the arithmetic mean) for all compounds, was used as a baseline model for comparison. Methods with a Root Mean Squared Error (RMSE) greater than the RMSE produced by the AAM were considered unacceptable. The majority of analyzed methods produced reasonable results for the public dataset, but only seven methods were successful on both in-house datasets. We proposed a simple equation based on the number of carbon atoms, NC, and the number of hetero atoms, NHET: log P = 1.46 (± 0.02) + 0.11 (± 0.001) NC − 0.11 (± 0.001) NHET. This equation outperformed a large number of programs benchmarked in this study. Factors influencing the accuracy of log P predictions were elucidated and discussed. © 2008 Wiley-Liss, Inc. and the American Pharmacists Association. J Pharm Sci 98:861-893, 2009
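The two quantitative ideas in the abstract, the simple atom-count equation and the AAM acceptance criterion, can be illustrated with a minimal Python sketch. The function names below are hypothetical and not taken from the paper. The coefficients are the central values quoted in the abstract, with the uncertainties ignored, and the atom counts are assumed to be supplied by the caller rather than derived from a structure. The observed and predicted values in the demo are invented toy numbers, not measurements.

```python
import math


def simple_log_p(n_carbon, n_hetero):
    """Atom-count estimate from the abstract: 1.46 + 0.11*NC - 0.11*NHET.

    Uses only the central coefficient values; the quoted uncertainties are ignored.
    """
    return 1.46 + 0.11 * n_carbon - 0.11 * n_hetero


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def aam_rmse(observed):
    """RMSE of the Arithmetic Average Model: predict the mean for every compound."""
    mean = sum(observed) / len(observed)
    return rmse([mean] * len(observed), observed)


def is_acceptable(method_rmse, baseline_rmse):
    """A method is unacceptable only if its RMSE exceeds the AAM baseline RMSE."""
    return method_rmse <= baseline_rmse


if __name__ == "__main__":
    # Invented toy values for illustration only (not experimental data).
    observed = [1.2, 2.5, 3.1, 0.4]
    predicted = [1.0, 2.9, 2.8, 0.9]
    baseline = aam_rmse(observed)
    print("AAM RMSE:", round(baseline, 3))
    print("Method RMSE:", round(rmse(predicted, observed), 3))
    print("Acceptable:", is_acceptable(rmse(predicted, observed), baseline))
    print("Estimate for NC=6, NHET=0:", round(simple_log_p(6, 0), 2))
```

Because the AAM predicts the dataset mean for every compound, its RMSE equals the population standard deviation of the observed values, so the criterion amounts to asking whether a method does better than knowing only the dataset's mean.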
Pages: 861-893
Page count: 33