Protein deep profile and model predictions for identifying the causal genes of male infertility based on deep learning

Cited by: 6
Authors
Xu, Fang [1, 4]
Guo, Ganggang [2]
Zhu, Feida [3]
Tan, Xiaojun [4]
Fan, Liqing [1]
Affiliations
[1] Cent South Univ, Sch Basic Med Sci, Inst Reprod & Stem Cell Engn, Changsha 410000, Peoples R China
[2] Cent South Univ, Sch Business, Changsha 410000, Peoples R China
[3] Singapore Management Univ, Sch Informat Syst, Stamford Rd, Singapore 178902, Singapore
[4] Cent South Univ, Clin Practice Base, Xiangtan Cent Hosp, Xiangtan 411100, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data integration; Disease phenotype; Male infertility; Causal gene; Knowledge representation; Convolutional neural network; Manifold learning; Deep learning; GENOMIC DATA; DISEASE ASSOCIATIONS; CLINICAL VALIDITY; NETWORK; FUSION; CLASSIFICATION; SEQUENCE; HEALTH; OMICS; REPRESENTATION;
DOI
10.1016/j.inffus.2021.04.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A principal task in dissecting the genetics of complex traits is to identify causal genes for disease phenotypes. Millions of genes have been sequenced in the data-driven genomics era, but knowledge of their causal relationships with disease phenotypes remains limited because causal genes are difficult to elucidate with laboratory-based strategies. Here, we propose a deep learning computational modeling alternative (the DPPCG framework) for identifying causal (coding) genes for a specific disease phenotype. Taking male infertility as the phenotype, we introduce proteins as intermediate cell variables and integrate deep knowledge representations (Word2vec, ProtVec, Node2vec, and Space2vec) into quantitative 'protein deep profiles'. We adopt a deep convolutional neural network (CNN) classifier to model the relationships between protein deep profiles and male infertility, training deep CNN models for single-label binary classification and for multi-label eight-class classification. We demonstrate the capabilities of the DPPCG framework by integrating heterogeneous biomedical big data, including literature, protein sequences, protein-protein interactions, gene expression, and gene-phenotype relationships, and by indirectly predicting 794 causal genes of male infertility and associated pathological processes. We present this research in an interactive 'Smart Protein' intelligent (demo) system (http://www.smartprotein.cloud/public/home). Researchers can benefit from our intelligent system by (i) accessing a shallow gene/protein-radar service involving research status and a knowledge graph-based vertical search; (ii) querying and downloading protein deep profile matrices; (iii) accessing intelligent recommendations for causal genes of male infertility and associated pathological processes, and references for model architectures, parameter settings, and training outputs; and (iv) carrying out personalized analysis such as online K-Means clustering.
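The abstract's two ideas, a 'protein deep profile' assembled from four representations and classifiers with single-label binary versus multi-label outputs, can be illustrated with a minimal sketch. This is not the DPPCG implementation: the embedding dimensions, the concatenation order, the 0.5 threshold, and all function names are assumptions made for illustration, and the CNN itself is omitted (only the decoding of its hypothetical output logits is shown).

```python
import math
import random

# Hypothetical embedding sizes; the abstract does not state the real dimensions.
EMBEDDING_DIMS = {"word2vec": 4, "protvec": 3, "node2vec": 3, "space2vec": 2}


def build_profile(embeddings):
    """Concatenate the four per-protein embeddings in a fixed order (assumed scheme)."""
    profile = []
    for name, dim in EMBEDDING_DIMS.items():
        vector = embeddings[name]
        if len(vector) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(vector)}")
        profile.extend(vector)
    return profile


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_binary(logit, threshold=0.5):
    """Single-label binary output: one logit, one yes/no decision."""
    return sigmoid(logit) >= threshold


def decode_multilabel(logits, threshold=0.5):
    """Multi-label output: each label is thresholded independently,
    so a protein may be assigned several labels at once."""
    return [sigmoid(z) >= threshold for z in logits]


# Toy data standing in for real embeddings of one protein.
rng = random.Random(0)
toy = {name: [rng.uniform(-1, 1) for _ in range(dim)]
       for name, dim in EMBEDDING_DIMS.items()}
profile = build_profile(toy)

binary_call = decode_binary(0.8)
# Eight hypothetical logits, one per label in the eight-label setting.
multilabel_call = decode_multilabel([2.0, -1.0, 0.3, -0.2, 1.5, -3.0, 0.0, 0.7])
```

The point of the contrast is that the binary output yields a single decision per protein, whereas the multi-label output can switch on any subset of the eight labels, which is one plausible reading of "multi-label eight-class classification".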
Pages: 70-89
Page count: 20