deep-Sep: a deep learning-based method for fast and accurate prediction of selenoprotein genes in bacteria
Authors:
Xiao, Yao [1]
Zhang, Yan [1,2]
Affiliations:
[1] Shenzhen Univ, Brain Dis & Big Data Res Inst, Coll Life Sci & Oceanog, Shenzhen Key Lab Marine Bioresources & Ecol, Shenzhen, Guangdong, Peoples R China
[2] Shenzhen Fundamental Res Inst, Shenzhen Hong Kong Inst Brain Sci, Shenzhen, Guangdong, Peoples R China
Keywords:
selenium;
selenoprotein;
UGA codon;
deep learning;
bacteria;
SELENOCYSTEINE INSERTION;
SELENIUM;
IDENTIFICATION;
ALGORITHM;
ELEMENTS;
DOI:
10.1128/msystems.01258-24
Chinese Library Classification number:
Q93 [Microbiology];
Discipline classification codes:
071005 ;
100705 ;
Abstract:
Selenoproteins are a special group of proteins with major roles in cellular antioxidant defense. They contain the 21st amino acid, selenocysteine (Sec), in their active sites; Sec is encoded by an in-frame UGA codon. Identifying selenoprotein genes remains more challenging in bacteria than in eukaryotes because there is no effective strategy for distinguishing a Sec-encoding UGA codon from a normal stop signal. In this study, we developed a deep learning-based algorithm, deep-Sep, for quickly and precisely identifying selenoprotein genes in bacterial genomic sequences. The algorithm uses a Transformer-based neural network architecture to construct an optimal model for detecting Sec-encoding UGA codons and a homology search-based strategy to remove additional false positives. During the training and testing stages, deep-Sep achieved an F1 score of 0.939 and an area under the receiver operating characteristic curve of 0.987. Furthermore, when applied to 20 bacterial genomes as independent test data sets, deep-Sep identified both known and new selenoprotein genes and significantly outperformed the existing state-of-the-art method. The algorithm is a powerful tool for comprehensively characterizing selenoprotein genes in bacterial genomes; it should not only assist in accurate annotation of selenoprotein genes in genome sequencing projects but also provide new insights into the roles of selenium in bacteria.
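The abstract does not describe how candidate UGA codons are presented to the model. As a rough illustration of the general idea only, the sketch below collects every TGA (the DNA form of UGA) in a sequence together with a fixed-length flanking window, which is the kind of input a sequence classifier could score. The function name `candidate_uga_windows` and the `flank` parameter are hypothetical and are not taken from deep-Sep.

```python
from typing import List, Tuple


def candidate_uga_windows(seq: str, flank: int = 30) -> List[Tuple[int, str]]:
    """Return (position, window) for each TGA that has `flank` bases on both sides.

    Illustrative assumption: this is NOT deep-Sep's actual preprocessing,
    only a sketch of extracting candidate codons with sequence context.
    """
    seq = seq.upper()
    windows = []
    pos = seq.find("TGA")
    while pos != -1:
        start, end = pos - flank, pos + 3 + flank
        # Skip codons too close to either end to have a full window.
        if start >= 0 and end <= len(seq):
            windows.append((pos, seq[start:end]))
        # Advance by one base so overlapping occurrences are also found.
        pos = seq.find("TGA", pos + 1)
    return windows


# Toy sequence: one TGA with 30 bases of context on each side.
demo = "A" * 30 + "TGA" + "C" * 30
for position, window in candidate_uga_windows(demo):
    print(position, len(window))
```

Each window would then be scored by a trained classifier, and hits would be filtered further, for example by a homology search as the abstract mentions; both steps are outside the scope of this sketch.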