Sequence-Based Predicting Bacterial Essential ncRNAs Algorithm by Machine Learning

Cited by: 0
Authors
Ye, Yuan-Nong [1, 2, 3]
Liang, Ding-Fa [2]
Labena, Abraham Alemayehu [4]
Zeng, Zhu [2]
Affiliations
[1] Guizhou Med Univ, Sch Big Hlth, Dept Med Informat, Bioinformat & Biomed Big data Min Lab, Guiyang 550025, Peoples R China
[2] Guizhou Med Univ, Cells & Antibody Engn Res Ctr Guizhou Prov, Sch Biol & Engn, Key Lab Biol & Med Engn, Guiyang 550025, Peoples R China
[3] Guizhou Med Univ, Key Lab Environm Pollut Monitoring & Dis Control, Minist Educ, Guiyang 550025, Peoples R China
[4] Dilla Univ, Coll Computat & Nat Sci, Dilla 419, Ethiopia
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; biological information theory; biomedical informatics; PROTEIN
DOI
10.32604/iasc.2023.026761
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Essential ncRNAs are ncRNAs that are indispensable for the survival of an organism. Although essential ncRNAs do not encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction, and evolution analysis. At present, essential ncRNAs have been determined at the whole-genome scale for only a few species, because traditional methods are time-consuming, laborious, and costly. In addition, traditional experimental methods are limited by the organisms themselves, since fewer than 1% of bacteria can be cultured in the laboratory. It is therefore important to develop theories and methods for recognizing essential non-coding RNAs. In this paper, we present a novel method for predicting essential ncRNAs that uses both compositional features and derivative features calculated from ncRNA sequences by information theory. The method was developed with a Support Vector Machine (SVM). Its accuracy was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the selected features perform well for predicting essential ncRNAs with an SVM. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
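The abstract does not list the exact features or the validation protocol, so the following is only a minimal sketch of what sequence-derived features and a cross-species split might look like. It assumes mono- and dinucleotide frequencies as the compositional features and Shannon entropy of those frequencies as one possible information-theoretic feature; the function names (`kmer_frequencies`, `shannon_entropy`, `features`, `leave_one_species_out`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import log2

ALPHABET = "ACGU"  # RNA alphabet; DNA input is converted (T -> U)


def kmer_frequencies(seq, k=2):
    """Relative frequency of every k-mer over ACGU, in a fixed order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with other symbols
    return [counts[m] / total if total else 0.0 for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy (in bits) of a frequency distribution."""
    return -sum(p * log2(p) for p in freqs if p > 0)


def features(seq):
    """Assumed feature vector: 4 mono + 16 di frequencies + 2 entropies."""
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return mono + di + [shannon_entropy(mono), shannon_entropy(di)]


def leave_one_species_out(species_labels):
    """Yield (held_out_species, train_indices, test_indices).

    One assumed reading of cross-species cross-validation: train on all
    species except one, then test on the held-out species.
    """
    for held_out in sorted(set(species_labels)):
        train = [i for i, s in enumerate(species_labels) if s != held_out]
        test = [i for i, s in enumerate(species_labels) if s == held_out]
        yield held_out, train, test
```

Feature vectors built this way could then be passed to any SVM implementation (for example, scikit-learn's `sklearn.svm.SVC`), training on the `train` indices and scoring accuracy on the `test` indices of each split.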
Pages: 2731-2741
Page count: 11