Ligand biological activity predicted by cleaning positive and negative chemical correlations

Cited by: 23
Authors
Lee, Alpha A. [1]
Yang, Qingyi [2 ]
Bassyouni, Asser [3 ]
Butler, Christopher R. [2 ]
Hou, Xinjun [2 ]
Jenkinson, Stephen [3 ]
Price, David A. [2 ]
Affiliations
[1] Univ Cambridge, Cavendish Lab, Cambridge CB3 0HE, England
[2] Pfizer Inc, Med Design, Cambridge, MA 02139 USA
[3] Pfizer Inc, Drug Safety Res & Dev, San Diego, CA 92121 USA
Keywords
random matrix theory; ligand-based drug discovery; bioactivity prediction; machine learning; chemoinformatics; MUSCARINIC ACETYLCHOLINE-RECEPTORS; DRUG DISCOVERY; QSAR; SETS; HERG;
DOI
10.1073/pnas.1810847116
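Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes
07; 0710; 09;
Abstract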
Predicting ligand biological activity is a key challenge in drug discovery. Ligand-based statistical approaches are often hampered by noise due to undersampling: The number of molecules known to be active or inactive is vastly less than the number of possible chemical features that might determine binding. We derive a statistical framework inspired by random matrix theory and combine the framework with high-quality negative data to discover important chemical differences between active and inactive molecules by disentangling undersampling noise. Our model outperforms standard benchmarks when tested against a set of challenging retrospective tests. We prospectively apply our model to the human muscarinic acetylcholine receptor M1, finding four experimentally confirmed agonists that are chemically dissimilar to all known ligands. The hit rate of our model is significantly higher than the state of the art. Our model can be interpreted and visualized to offer chemical insights about the molecular motifs that are synergistic or antagonistic to M1 agonism, which we have prospectively experimentally verified.
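To illustrate the general idea of separating undersampling noise from signal with random matrix theory, the sketch below filters the eigenvalues of a feature correlation matrix against the Marchenko-Pastur upper edge. It is a minimal illustration under stated assumptions, not the authors' published model: the function names `mp_upper_edge` and `significant_directions` are hypothetical, the input is assumed to be a molecules-by-features array (for example, fingerprint bits), and the noise is assumed to have unit variance after standardization.

```python
import numpy as np


def mp_upper_edge(n_samples, n_features):
    """Upper edge of the Marchenko-Pastur spectrum for unit-variance noise."""
    gamma = n_features / n_samples
    return (1.0 + np.sqrt(gamma)) ** 2


def significant_directions(X):
    """Eigenpairs of the feature correlation matrix that exceed the MP edge.

    X is an (n_samples, n_features) array. Constant features are dropped
    because they cannot be standardized. Hypothetical helper for illustration.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    std = X.std(axis=0)
    keep = std > 0
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / std[keep]
    # Sample correlation matrix of the standardized features.
    C = Z.T @ Z / n_samples
    evals, evecs = np.linalg.eigh(C)
    edge = mp_upper_edge(n_samples, Z.shape[1])
    # Eigenvalues above the edge are unlikely to arise from sampling noise alone.
    mask = evals > edge
    return evals[mask], evecs[:, mask], edge
```

Directions retained this way could then be compared between active and inactive sets; how the paper combines positive and negative data is not specified in this record, so that step is left out here.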
Pages: 3373-3378
Page count: 6