deep-Sep: a deep learning-based method for fast and accurate prediction of selenoprotein genes in bacteria

Cited by: 0
Authors
Xiao, Yao [1]
Zhang, Yan [1, 2]
Affiliations
[1] Shenzhen Univ, Brain Dis & Big Data Res Inst, Coll Life Sci & Oceanog, Shenzhen Key Lab Marine Bioresources & Ecol, Shenzhen, Guangdong, Peoples R China
[2] Shenzhen Fundamental Res Inst, Shenzhen Hong Kong Inst Brain Sci, Shenzhen, Guangdong, Peoples R China
Source
MSYSTEMS | 2025
Funding
National Natural Science Foundation of China;
Keywords
selenium; selenoprotein; UGA codon; deep learning; bacteria; SELENOCYSTEINE INSERTION; SELENIUM; IDENTIFICATION; ALGORITHM; ELEMENTS;
DOI
10.1128/msystems.01258-24
Chinese Library Classification number
Q93 [Microbiology];
Subject classification codes
071005 ; 100705 ;
Abstract
Selenoproteins are a special group of proteins with major roles in cellular antioxidant defense. They contain the 21st amino acid, selenocysteine (Sec), in their active sites, which is encoded by an in-frame UGA codon. Compared to eukaryotes, identification of selenoprotein genes in bacteria remains challenging due to the absence of an effective strategy for distinguishing the Sec-encoding UGA codon from a normal stop signal. In this study, we have developed a deep learning-based algorithm, deep-Sep, for quickly and precisely identifying selenoprotein genes in bacterial genomic sequences. This algorithm uses a Transformer-based neural network architecture to construct an optimal model for detecting Sec-encoding UGA codons and a homology search-based strategy to remove additional false positives. During the training and testing stages, deep-Sep achieved an F1 score of 0.939 and an area under the receiver operating characteristic curve of 0.987. Furthermore, when applied to 20 bacterial genomes as independent test data sets, deep-Sep identified both known and new selenoprotein genes, significantly outperforming the existing state-of-the-art method. Our algorithm has proved to be a powerful tool for comprehensively characterizing selenoprotein genes in bacterial genomes, which should not only assist in accurate annotation of selenoprotein genes in genome sequencing projects but also provide new insights for a deeper understanding of the roles of selenium in bacteria.
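To make the prediction problem concrete, the following minimal Python sketch enumerates in-frame TGA codons (the DNA form of UGA) in a sequence and extracts the flanking window around each one. It is not the deep-Sep implementation: the function name `find_inframe_tga`, the `flank` parameter, and the window size are hypothetical choices for illustration. Such windows are only the kind of candidate input that a classifier would then have to label as Sec-encoding or as a true stop signal.

```python
def find_inframe_tga(seq, frame=0, flank=30):
    """Return (position, window) for each TGA codon in the given reading frame.

    Hypothetical helper for illustration only; `flank` is the number of
    nucleotides kept on each side of the codon (an assumed value, not one
    taken from the paper).
    """
    seq = seq.upper()
    hits = []
    # Step through the sequence codon by codon in the chosen frame.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] == "TGA":
            start = max(0, i - flank)            # clip at the sequence start
            end = min(len(seq), i + 3 + flank)   # clip at the sequence end
            hits.append((i, seq[start:end]))
    return hits


if __name__ == "__main__":
    # Toy sequence: ATG GCT TGA GGC TAA
    for pos, window in find_inframe_tga("ATGGCTTGAGGCTAA", frame=0, flank=3):
        print(pos, window)
```

Each candidate window could then be scored by a trained model; the scanning step alone cannot tell a Sec-encoding UGA from an ordinary stop codon.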
Pages: 16