WinBinVec: Cancer-Associated Protein-Protein Interaction Extraction and Identification of 20 Various Cancer Types and Metastasis Using Different Deep Learning Models

Cited by: 8
Authors
Abdollahi, Sina [1]
Lin, Peng-Chan [2]
Chiang, Jung-Hsien [3]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ, Dept Internal Med, Tainan 701, Taiwan
[3] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Inst Med Informat, Tainan 701, Taiwan
Keywords
Proteins; Cancer; Feature extraction; Amino acids; Bioinformatics; Protein engineering; Deep learning; Cancer-associated PPIs; cancer type prediction; metastasis prediction; mutation accumulation; deep learning; BINDING-AFFINITY; VARIANTS; PROFILES;
DOI
10.1109/JBHI.2021.3093441
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Biophysical protein-protein interactions (PPIs) play dominant roles in the initiation and progression of many cancer-related pathways. A given PPI might play different roles in different cancer types. Hence, prioritizing the PPIs in each cancer type would help detect cancer-associated pathways, improve the understanding of cancer biology, and facilitate drug discovery. Several studies to date have proposed computational methods for extracting the PPI essentiality of different cancer types based on the PPI network. The main drawback of these studies is that they do not use a rich source such as genomic variant data. An amino acid sequence encodes useful information about protein structure and behavior. We represent each amino acid sequence based on its variants/mutations in seven different ways: binary vectors, pathogenicity scores, binding affinity changes upon mutation, a gene expression-based network of the interactions, biophysicochemical properties, g-gap dipeptides, and one-hot vectors. Based on these representations, we design and evaluate seven different deep learning models. Then, we compare the accuracy of these models in predicting 20 different cancer types from the TCGA cohort. WinBinVec, a window-based model, outperforms the other models. Moreover, WinBinVec contains a PPI essentiality module that helps extract the essentiality probability of each PPI for every cancer type. Source code and data: https://github.com/sabdollahi/WinBinVec.
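Two of the sequence representations named in the abstract, one-hot vectors and g-gap dipeptide composition, are standard encodings and can be illustrated with a minimal sketch. The code below is not taken from the WinBinVec repository: the function names `one_hot` and `g_gap_dipeptide`, the 20-letter alphabet ordering, and the choice to skip non-standard residues are assumptions made for illustration only.

```python
from itertools import product

# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Return an L x 20 list of 0/1 rows, one row per residue."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:  # non-standard residues (e.g. 'X') stay all-zero
            row[index[residue]] = 1
        rows.append(row)
    return rows


def g_gap_dipeptide(sequence, gap=0):
    """Return a 400-dimensional vector of residue-pair frequencies.

    A pair is (sequence[i], sequence[i + gap + 1]), so gap=0 gives the
    ordinary adjacent-dipeptide composition.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - gap - 1):
        pair = sequence[i] + sequence[i + gap + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    # Normalise by the number of counted pairs; an empty input gives all zeros.
    return [counts[p] / total if total else 0.0 for p in pairs]
```

For a sequence of length L, `one_hot` yields an L x 20 matrix, while `g_gap_dipeptide` always yields 400 values regardless of L, which is why composition-style features are convenient when proteins differ in length.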
Pages: 4052-4063
Number of pages: 12