deep-Sep: a deep learning-based method for fast and accurate prediction of selenoprotein genes in bacteria

Cited by: 0
Authors
Xiao, Yao [1]
Zhang, Yan [1, 2]
Affiliations
[1] Shenzhen Univ, Brain Dis & Big Data Res Inst, Coll Life Sci & Oceanog, Shenzhen Key Lab Marine Bioresources & Ecol, Shenzhen, Guangdong, Peoples R China
[2] Shenzhen Fundamental Res Inst, Shenzhen Hong Kong Inst Brain Sci, Shenzhen, Guangdong, Peoples R China
Source
MSYSTEMS | 2025
Funding
National Natural Science Foundation of China;
Keywords
selenium; selenoprotein; UGA codon; deep learning; bacteria; SELENOCYSTEINE INSERTION; SELENIUM; IDENTIFICATION; ALGORITHM; ELEMENTS;
DOI
10.1128/msystems.01258-24
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification code
071005; 100705;
Abstract
Selenoproteins are a special group of proteins with major roles in cellular antioxidant defense. They contain the 21st amino acid, selenocysteine (Sec), in their active sites, which is encoded by an in-frame UGA codon. Compared to eukaryotes, identification of selenoprotein genes in bacteria remains challenging due to the absence of an effective strategy for distinguishing the Sec-encoding UGA codon from a normal stop signal. In this study, we have developed a deep learning-based algorithm, deep-Sep, for quickly and precisely identifying selenoprotein genes in bacterial genomic sequences. This algorithm uses a Transformer-based neural network architecture to construct an optimal model for detecting Sec-encoding UGA codons and a homology search-based strategy to remove additional false positives. During the training and testing stages, deep-Sep achieved an F1 score of 0.939 and an area under the receiver operating characteristic curve of 0.987. Furthermore, when applied to 20 bacterial genomes as independent test data sets, deep-Sep identified both known and new selenoprotein genes and significantly outperformed the existing state-of-the-art method. Our algorithm has proved to be a powerful tool for comprehensively characterizing selenoprotein genes in bacterial genomes, which should not only assist in accurate annotation of selenoprotein genes in genome sequencing projects but also provide new insights for a deeper understanding of the roles of selenium in bacteria.
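To make the core difficulty concrete, the sketch below lists the in-frame UGA codons (written as TGA in DNA) of a coding sequence, which are the candidate positions that a classifier would have to label as Sec-encoding or as a true stop. This is an illustration only, not the deep-Sep pipeline: the function name `find_inframe_tga` and the assumption that the input already starts at the first codon of the reading frame are hypothetical choices made for this example.

```python
def find_inframe_tga(orf: str) -> list[int]:
    """Return 0-based nucleotide offsets of in-frame TGA codons.

    Assumes (for illustration) that `orf` is a DNA string whose first
    base is the first base of the reading frame. The final complete
    codon is skipped, since it is the gene's own terminator.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3   # ignore a trailing partial codon
    last_codon_start = usable - 3      # start offset of the terminal codon
    return [
        i
        for i in range(0, last_codon_start, 3)
        if orf[i:i + 3] == "TGA"
    ]


if __name__ == "__main__":
    # Codons: ATG TGA GGG TGA; only the internal TGA is a candidate.
    print(find_inframe_tga("ATGTGAGGGTGA"))
```

Each returned offset marks a codon that conventional gene finders would read as a stop, which is why a dedicated model is needed to decide which of these candidates actually encode Sec.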
Pages: 16