Sequence-Based Predicting Bacterial Essential ncRNAs Algorithm by Machine Learning

Cited by: 0
Authors
Ye, Yuan-Nong [1, 2, 3]
Liang, Ding-Fa [2]
Labena, Abraham Alemayehu [4]
Zeng, Zhu [2]
Affiliations
[1] Guizhou Med Univ, Sch Big Hlth, Dept Med Informat, Bioinformat & Biomed Big data Min Lab, Guiyang 550025, Peoples R China
[2] Guizhou Med Univ, Cells & Antibody Engn Res Ctr Guizhou Prov, Sch Biol & Engn, Key Lab Biol & Med Engn, Guiyang 550025, Peoples R China
[3] Guizhou Med Univ, Key Lab Environm Pollut Monitoring & Dis Control, Minist Educ, Guiyang 550025, Peoples R China
[4] Dilla Univ, Coll Computat & Nat Sci, Dilla 419, Ethiopia
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; biological information theory; biomedical informatics; PROTEIN;
DOI
10.32604/iasc.2023.026761
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Essential ncRNAs are ncRNAs that are indispensable for the survival of an organism. Although essential ncRNAs do not encode proteins, they are as important as essential coding genes in biology. They have a wide variety of applications, such as antimicrobial target discovery, minimal genome construction and evolution analysis. At present, essential ncRNAs have been determined at the whole-genome scale for only a few species, because the traditional methods are time-consuming, laborious and costly. In addition, traditional experimental methods are limited by the organisms themselves, as less than 1% of bacteria can be cultured in the laboratory. Therefore, it is important and necessary to develop theories and methods for the recognition of essential non-coding RNA. In this paper, we present a novel method for predicting essential ncRNAs that uses both compositional features and derivative features calculated from ncRNA sequences by information theory. The method was developed with a Support Vector Machine (SVM). Its accuracy was evaluated through cross-species cross-validation and found to be between 0.69 and 0.81. This shows that the selected features perform well for the prediction of essential ncRNAs using SVM. Thus, the method can be applied to discovering essential ncRNAs in bacteria.
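The abstract does not list the exact features or the cross-validation protocol, so the following is only a minimal sketch of the general idea, under stated assumptions. It takes mono- and dinucleotide frequencies as the compositional features and the Shannon entropy of each frequency distribution as the information-theoretic derivative features, and it models cross-species validation as holding out one species at a time. The function names `kmer_frequencies`, `shannon_entropy`, `featurize` and `leave_one_species_out` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA input is converted by mapping T to U


def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer over ACGU, in lexicographic order.

    k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def shannon_entropy(freqs):
    """Shannon entropy in bits of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def featurize(seq):
    """Assumed feature vector: 4 mono + 16 dinucleotide frequencies + 2 entropies."""
    mono = kmer_frequencies(seq, 1)
    di = kmer_frequencies(seq, 2)
    return mono + di + [shannon_entropy(mono), shannon_entropy(di)]


def leave_one_species_out(species):
    """Yield (held_out, train_indices, test_indices) for each species in turn.

    `species` holds one species label per sample. This is an assumed reading
    of "cross-species cross-validation", not the paper's documented protocol.
    """
    for held_out in sorted(set(species)):
        train = [i for i, s in enumerate(species) if s != held_out]
        test = [i for i, s in enumerate(species) if s == held_out]
        yield held_out, train, test
```

Vectors produced by `featurize` could then be passed to any SVM implementation (for example scikit-learn's `SVC`), training on the `train` indices of each split and measuring accuracy on the held-out species.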
Pages: 2731-2741
Number of pages: 11