Protein deep profile and model predictions for identifying the causal genes of male infertility based on deep learning

Cited by: 6
Authors
Xu, Fang [1 ,4 ]
Guo, Ganggang [2 ]
Zhu, Feida [3 ]
Tan, Xiaojun [4 ]
Fan, Liqing [1 ]
Affiliations
[1] Cent South Univ, Sch Basic Med Sci, Inst Reprod & Stem Cell Engn, Changsha 410000, Peoples R China
[2] Cent South Univ, Sch Business, Changsha 410000, Peoples R China
[3] Singapore Management Univ, Sch Informat Syst, Stamford Rd, Singapore 178902, Singapore
[4] Cent South Univ, Clin Practice Base, Xiangtan Cent Hosp, Xiangtan 411100, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data integration; Disease phenotype; Male infertility; Causal gene; Knowledge representation; Convolutional neural network; Manifold learning; Deep learning; GENOMIC DATA; DISEASE ASSOCIATIONS; CLINICAL VALIDITY; NETWORK; FUSION; CLASSIFICATION; SEQUENCE; HEALTH; OMICS; REPRESENTATION;
DOI
10.1016/j.inffus.2021.04.012
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A principal task in dissecting the genetics of complex traits is to identify causal genes for disease phenotypes. Millions of genes have been sequenced in the data-driven genomics era, but knowledge of their causal relationships with disease phenotypes remains limited because the underlying causal genes are difficult to elucidate with laboratory-based strategies. Here, we proposed a deep learning computational modeling alternative (the DPPCG framework) for identifying causal (coding) genes for a specific disease phenotype. For male infertility, we introduced proteins as intermediate cell variables and integrated deep knowledge representations (Word2vec, ProtVec, Node2vec, and Space2vec) into quantitative 'protein deep profiles'. We adopted a deep convolutional neural network (CNN) classifier to model the relationships between protein deep profiles and male infertility, training deep CNN models for single-label binary classification and for multi-label, eight-class classification. We demonstrate the capabilities of the DPPCG framework by integrating heterogeneous biomedical big data, including literature, protein sequences, protein-protein interactions, gene expression, and gene-phenotype relationships, and by indirectly predicting 794 causal genes of male infertility and associated pathological processes. We present this research in an interactive 'Smart Protein' intelligent (demo) system (http://www.smartprotein.cloud/public/home). Researchers can benefit from this system by (i) accessing a shallow gene/protein-radar service covering research status and a knowledge graph-based vertical search; (ii) querying and downloading protein deep profile matrices; (iii) accessing intelligent recommendations for causal genes of male infertility and associated pathological processes, along with references for model architectures, parameter settings, and training outputs; and (iv) carrying out personalized analysis such as online K-Means clustering.
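The abstract distinguishes a single-label binary classifier from a multi-label classifier over eight labels, both fed by a profile built from four representations. The sketch below only illustrates that distinction at the output stage. It is not the DPPCG implementation: combining the four vectors by concatenation, using one independent sigmoid per label, the 0.5 threshold, and all function names and toy numbers are assumptions made for illustration.

```python
import math

# Hypothetical helper: the abstract does not say how the four
# representations are combined; plain concatenation is assumed here.
def build_profile(word_vec, prot_vec, node_vec, space_vec):
    """Join four per-protein embedding vectors into one flat profile."""
    return list(word_vec) + list(prot_vec) + list(node_vec) + list(space_vec)

def sigmoid(x):
    """Map a raw score (logit) to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def binary_decision(logit, threshold=0.5):
    """Single-label binary output: one score, one yes/no answer."""
    return sigmoid(logit) >= threshold

def multilabel_decision(logits, threshold=0.5):
    """Multi-label output: each label is thresholded independently,
    so any number of the labels can be active for the same protein."""
    return [sigmoid(z) >= threshold for z in logits]

# Toy values only; real profiles and classifier scores would come from
# trained embedding models and a trained CNN.
profile = build_profile([0.1, 0.2], [0.3], [0.4, 0.5], [0.6])
binary_flag = binary_decision(1.2)
label_flags = multilabel_decision([2.0, -1.0, 0.3, -0.2, 1.5, -3.0, 0.1, -0.7])
print(len(profile), binary_flag, label_flags)
```

In the binary case a protein receives one decision, whereas in the multi-label case the same protein can be flagged for several of the eight labels at once.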
Pages: 70 - 89
Page count: 20
Related papers
50 in total
  • [1] DeepRepViz: Identifying Potential Confounders in Deep Learning Model Predictions
    Rane, Roshan Prakash
    Kim, JiHoon
    Umesha, Arjun
    Stark, Didem
    Schulz, Marc-Andre
    Ritter, Kerstin
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT X, 2024, 15010 : 186 - 196
  • [2] BGFE: A Deep Learning Model for ncRNA-Protein Interaction Predictions Based on Improved Sequence Information
    Zhan, Zhao-Hui
    Jia, Li-Na
    Zhou, Yong
    Li, Li-Ping
    Yi, Hai-Cheng
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2019, 20 (04)
  • [3] IChrom-Deep: An Attention-Based Deep Learning Model for Identifying Chromatin Interactions
    Zhang, Pengyu
    Wu, Hao
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27 (09) : 4559 - 4568
  • [4] A prediction model of nonclassical secreted protein based on deep learning
    Zhang, Fan
    Liu, Chaoyang
    Wang, Binjie
    He, Yiru
    Zhang, Xinhong
    JOURNAL OF CHEMOMETRICS, 2024, 38 (08)
  • [5] Sweetener classification model based on deep learning
    Xiao L.
    Chen A.
    Zhou G.
    Yi J.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2021, 37 (11) : 285 - 291
  • [6] Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization
    Shlezinger, Nir
    Eldar, Yonina C.
    Boyd, Stephen P.
    IEEE ACCESS, 2022, 10 : 115384 - 115398
  • [7] Geometric Deep Learning for Protein-Protein Interaction Predictions
    Lemieux, Gabriel St-Pierre
    Paquet, Eric
    Viktor, Herna L.
    Michalowski, Wojtek
    IEEE ACCESS, 2022, 10 : 90045 - 90055
  • [8] Medical Image Classification based on an Adaptive Size Deep Learning Model
    Liu, Xiangbin
    He, Jiesheng
    Song, Liping
    Liu, Shuai
    Srivastava, Gautam
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (03)
  • [9] Mining influential genes based on deep learning
    Kong, Lingpeng
    Chen, Yuanyuan
    Xu, Fengjiao
    Xu, Mingmin
    Li, Zutan
    Fang, Jingya
    Zhang, Liangyun
    Pian, Cong
    BMC BIOINFORMATICS, 2021, 22 (01)
  • [10] Deep Learning Model for Protein Disease Classification
    Mostafa, Farida Alaaeldin
    Afify, Yasmine Mohamed
    Ismail, Rasha Mohamed
    Badr, Nagwa Lotfy
    CURRENT BIOINFORMATICS, 2022, 17 (03) : 245 - 253