deep-Sep: a deep learning-based method for fast and accurate prediction of selenoprotein genes in bacteria

Authors
Xiao, Yao [1]
Zhang, Yan [1, 2]
Affiliations
[1] Shenzhen Univ, Brain Dis & Big Data Res Inst, Coll Life Sci & Oceanog, Shenzhen Key Lab Marine Bioresources & Ecol, Shenzhen, Guangdong, Peoples R China
[2] Shenzhen Fundamental Res Inst, Shenzhen Hong Kong Inst Brain Sci, Shenzhen, Guangdong, Peoples R China
Source
mSystems | 2025
Funding
National Natural Science Foundation of China
Keywords
selenium; selenoprotein; UGA codon; deep learning; bacteria; SELENOCYSTEINE INSERTION; SELENIUM; IDENTIFICATION; ALGORITHM; ELEMENTS;
DOI
10.1128/msystems.01258-24
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification codes
071005; 100705
Abstract
Selenoproteins are a special group of proteins with major roles in cellular antioxidant defense. They contain the 21st amino acid, selenocysteine (Sec), in their active sites; Sec is encoded by an in-frame UGA codon. Compared with eukaryotes, identification of selenoprotein genes in bacteria remains challenging because there is no effective strategy for distinguishing a Sec-encoding UGA codon from a normal stop signal. In this study, we developed a deep learning-based algorithm, deep-Sep, for quickly and precisely identifying selenoprotein genes in bacterial genomic sequences. The algorithm uses a Transformer-based neural network architecture to build an optimal model for detecting Sec-encoding UGA codons, followed by a homology search-based strategy to remove additional false positives. In training and testing, deep-Sep achieved an F1 score of 0.939 and an area under the receiver operating characteristic curve of 0.987. Furthermore, when applied to 20 bacterial genomes as independent test data sets, deep-Sep identified both known and new selenoprotein genes and significantly outperformed the existing state-of-the-art method. The algorithm is a powerful tool for comprehensively characterizing selenoprotein genes in bacterial genomes; it should not only assist accurate annotation of selenoprotein genes in genome sequencing projects but also provide new insights into the roles of selenium in bacteria.
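To make the core problem concrete, the following is a minimal sketch of the candidate-enumeration step that any UGA classifier would need: it scans a coding sequence for in-frame UGA (TGA in DNA) codons and extracts the flanking sequence window around each one. This is an illustration only, not the deep-Sep implementation. The function name `find_inframe_tga`, the `UgaCandidate` type, and the `flank` window size are hypothetical; the abstract does not describe how deep-Sep prepares its model input.

```python
from typing import Iterator, NamedTuple


class UgaCandidate(NamedTuple):
    """One in-frame TGA codon and its sequence context (hypothetical type)."""
    position: int  # 0-based nucleotide offset of the "T" in TGA
    window: str    # flanking nucleotides plus the codon itself


def find_inframe_tga(seq: str, flank: int = 30, frame: int = 0) -> Iterator[UgaCandidate]:
    """Yield every TGA codon that lies in the given reading frame.

    `flank` is an assumed window size; windows are truncated at the
    sequence ends rather than padded.
    """
    # Accept RNA or DNA input by normalising U to T.
    seq = seq.upper().replace("U", "T")
    # Step codon by codon so that out-of-frame TGA triplets are ignored.
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] == "TGA":
            start = max(0, pos - flank)
            end = min(len(seq), pos + 3 + flank)
            yield UgaCandidate(position=pos, window=seq[start:end])


if __name__ == "__main__":
    # Codons: ATG GCT TGA GGC TAA, so the only in-frame TGA starts at offset 6.
    for cand in find_inframe_tga("ATGGCTTGAGGCTAA", flank=3):
        print(cand.position, cand.window)
```

Each window produced this way is only a candidate. Deciding whether the UGA encodes Sec or terminates translation is the classification task that the abstract assigns to the Transformer-based model and the subsequent homology search.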
Pages: 16