Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction

Cited by: 127
Authors
Cang, Zixuan [1]
Wei, Guo-Wei [1, 2, 3]
Affiliations
[1] Michigan State Univ, Dept Math, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Biochem & Mol Biol, E Lansing, MI 48824 USA
[3] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Funding
National Science Foundation (USA);
Keywords
protein-ligand binding affinity; machine learning; topology; EMPIRICAL SCORING FUNCTIONS; COMPUTATION; TOPOLOGY;
DOI
10.1002/cnm.2914
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and to the understanding of protein-ligand binding and binding-induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH), or multicomponent persistent homology, to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on two large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity prediction. ESPH reveals protein-ligand binding mechanisms that cannot be obtained with other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions extend up to 40 Å from the binding site, which has significant ramifications for drug and protein design.
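The element-specific idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it computes only 0-dimensional persistence (connected components) with a union-find pass, and it assumes, for illustration, that only protein-ligand atom pairs are connected while pairs within the same molecule are treated as infinitely far apart. The names `ATOMS`, `h0_deaths`, and `element_specific_h0` are hypothetical, and the coordinates are made up.

```python
import math
from itertools import combinations

# Toy atoms: (element, molecule, (x, y, z)). Coordinates are invented for illustration.
ATOMS = [
    ("C", "protein", (0.0, 0.0, 0.0)),
    ("C", "protein", (4.0, 0.0, 0.0)),
    ("N", "protein", (0.0, 3.0, 0.0)),
    ("C", "ligand", (1.0, 0.0, 0.0)),
    ("O", "ligand", (0.0, 2.0, 0.0)),
]


def h0_deaths(points, labels):
    """Death values of the finite H0 bars of a Rips-style filtration.

    Every point is born at 0. Two components merge at the length of the
    shortest edge joining them. Only pairs with different labels get an
    edge (assumption: same-molecule pairs are infinitely far apart).
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
        if labels[i] != labels[j]
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge: one bar dies
            deaths.append(length)
    return deaths


def element_specific_h0(atoms, protein_element, ligand_element):
    """H0 death values for one (protein element, ligand element) pair."""
    selected = [
        (xyz, mol)
        for elem, mol, xyz in atoms
        if (mol == "protein" and elem == protein_element)
        or (mol == "ligand" and elem == ligand_element)
    ]
    points = [xyz for xyz, _ in selected]
    labels = [mol for _, mol in selected]
    return h0_deaths(points, labels)


if __name__ == "__main__":
    for pair in [("C", "C"), ("N", "O")]:
        print(pair, element_specific_h0(ATOMS, *pair))
```

Each element pair yields its own list of bar lengths. Binning those lengths into counts would give one fixed-length feature vector per pair, which is the kind of input a machine-learning regressor could consume; the binning scheme and the learner are left open here because the abstract does not specify them.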
Pages: 17