OctSurf: Efficient hierarchical voxel-based molecular surface representation for protein-ligand affinity prediction

Cited by: 22
Authors
Liu, Qinqing [1]
Wang, Peng-Shuai [2]
Zhu, Chunjiang [1]
Gaines, Blake Blumenfeld [1]
Zhu, Tan [1]
Bi, Jinbo [1,3]
Song, Minghu [3]
Affiliations
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06279 USA
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Univ Connecticut, Dept Biomed Engn, Storrs, CT 06279 USA
Keywords
Protein-Ligand affinity prediction; Convolution neural networks; 3D volumetric representation; Octree; Molecular surface
DOI
10.1016/j.jmgm.2021.107865
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Voxel-based 3D convolutional neural networks (CNNs) have been applied to predict protein-ligand binding affinity. However, the memory usage and computation cost of these voxel-based approaches increase cubically with respect to spatial resolution and sometimes make volumetric CNNs intractable at higher resolutions. Therefore, it is necessary to develop memory-efficient alternatives that can accelerate the convolutional operation on 3D volumetric representations of the protein-ligand interaction. In this study, we implement a novel volumetric representation, OctSurf, to characterize the 3D molecular surface of protein binding pockets and bound ligands. The OctSurf surface representation is built based on the octree data structure, which has been widely used in computer graphics to efficiently represent and store 3D object data. Vanilla 3D-CNN approaches often divide the 3D space of objects into equal-sized voxels. In contrast, OctSurf recursively partitions the 3D space containing the protein-ligand pocket into eight subspaces called octants. Only those octants containing van der Waals surface points of protein or ligand atoms undergo the recursive subdivision process until they reach the predefined octree depth, whereas unoccupied octants are kept intact to reduce the memory cost. Resulting non-empty leaf octants approximate molecular surfaces of the protein pocket and bound ligands. These surface octants, along with their chemical and geometric features, are used as the input to 3D-CNNs. Two kinds of CNN architectures, VGG and ResNet, are applied to the OctSurf representation to predict binding affinity. The OctSurf representation consumes much less memory than the conventional voxel representation at the same resolution. By restricting the convolution operation to only octants of the smallest size, our method also alleviates the overall computational overhead of CNN. 
A series of experiments are performed to demonstrate the disk storage and computational efficiency of the proposed learning method. Our code is available at the following GitHub repository: https://github.uconn.edu/mldrugdiscovery/OctSurf. © 2021 Elsevier Inc. All rights reserved.
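The recursive subdivision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it only shows the general octree idea under stated assumptions: the surface points are a synthetic sphere standing in for van der Waals surface points, and the names `build_octree`, `surface_octants`, and `sphere_surface_points` are hypothetical helpers introduced here. Only occupied octants are split, so the number of stored leaf octants can be compared with the `8 ** depth` cells a dense voxel grid would need at the same resolution.

```python
import numpy as np


def build_octree(points, center, half_size, depth, max_depth, leaves):
    """Recursively split a cubic octant; only occupied octants are subdivided."""
    if len(points) == 0:
        return  # unoccupied octant: left intact, nothing is stored below it
    if depth == max_depth:
        # Non-empty leaf octant: approximates a small patch of the surface.
        leaves.append((center.copy(), half_size, len(points)))
        return
    # Assign every point to one of the 8 children by comparing with the center.
    bits = (points >= center).astype(int)  # shape (n, 3), values 0 or 1
    child_index = bits[:, 0] * 4 + bits[:, 1] * 2 + bits[:, 2]
    for idx in range(8):
        # Decode the child index back into a (-1 or +1) offset per axis.
        offset = np.array([(idx >> 2) & 1, (idx >> 1) & 1, idx & 1]) * 2 - 1
        child_center = center + offset * half_size / 2
        build_octree(points[child_index == idx], child_center,
                     half_size / 2, depth + 1, max_depth, leaves)


def surface_octants(points, box_half_size, max_depth):
    """Return the non-empty leaf octants as (center, half_size, point_count)."""
    leaves = []
    build_octree(points, np.zeros(3), box_half_size, 0, max_depth, leaves)
    return leaves


def sphere_surface_points(n, radius, seed=0):
    """Synthetic stand-in for molecular surface points (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n, 3))
    return radius * v / np.linalg.norm(v, axis=1, keepdims=True)


if __name__ == "__main__":
    pts = sphere_surface_points(20000, radius=8.0)
    for depth in (3, 4, 5, 6):
        leaves = surface_octants(pts, box_half_size=10.0, max_depth=depth)
        print(f"depth {depth}: {len(leaves)} surface octants "
              f"vs {8 ** depth} dense voxels")
```

In this sketch the dense grid grows by a factor of eight per level, while the count of occupied leaf octants grows more slowly because only cells touching the sampled surface are kept. Restricting convolution to those leaves is the source of the memory and compute savings the abstract refers to.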
Pages: 14