rSeqTU: A Machine-Learning Based R Package for Prediction of Bacteria Transcription Units

Times cited: 5
Authors
Niu, Sheng-Yong [1]
Liu, Binqiang [2]
Ma, Qin [3]
Chou, Wen-Chi [4]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[2] Shandong Univ, Sch Math, Jinan, Shandong, Peoples R China
[3] Ohio State Univ, Coll Med, Biomed Informat, Columbus, OH 43210 USA
[4] Broad Inst MIT & Harvard, Infect Dis & Microbiome Program, Cambridge, MA 02142 USA
Funding
US National Science Foundation
Keywords
machine learning; bacteria; transcription unit; R package; transcriptome
DOI
10.3389/fgene.2019.00374
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
A transcription unit (TU) is composed of one or more adjacent genes on the same strand that are co-transcribed, mostly in prokaryotes. Accurate identification of TUs is a crucial first step toward delineating transcriptional regulatory networks and elucidating the dynamic regulatory mechanisms encoded in various prokaryotic genomes. Many genomic features, such as intergenic distance, and transcriptomic features, including continuous and stable RNA-seq read count signals, have been collected from a large amount of experimental data and integrated into classification techniques to computationally predict genome-wide TUs. Although some tools and web servers can predict TUs from bacterial RNA-seq data and genome sequences, there is a need for an improved machine learning prediction approach and a more comprehensive pipeline that handles quality control (QC), TU prediction, and TU visualization. To enable users to efficiently perform TU identification on their local computers or high-performance clusters, and to provide more accurate predictions, we developed an R package named rSeqTU. rSeqTU uses a random forest algorithm to select essential features describing TUs and then uses a support vector machine (SVM) to build TU prediction models. rSeqTU (available at https://s18692001.github.io/rSeqTU/) has six computational functionalities: read quality control, read mapping, training set generation, random forest-based feature selection, TU prediction, and TU visualization.
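As a rough illustration of the TU definition in the abstract, the sketch below lists adjacent same-strand gene pairs and computes their intergenic distance, one of the genomic features mentioned. It is not rSeqTU code and does not use its API. The `Gene` type, the `candidate_pairs` function, the gene names, and the coordinates are all hypothetical, and "adjacent" is assumed to mean consecutive in genome order. The RNA-seq features, random forest selection, and SVM model are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    name: str    # hypothetical gene identifier
    start: int   # 1-based inclusive start coordinate
    end: int     # 1-based inclusive end coordinate
    strand: str  # "+" or "-"


def candidate_pairs(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (left_gene, right_gene, intergenic_distance) for each pair of
    genes that are consecutive in genome order and lie on the same strand."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            continue  # genes on opposite strands cannot share a TU
        # Number of bases strictly between the two genes; negative if they overlap.
        distance = right.start - left.end - 1
        pairs.append((left.name, right.name, distance))
    return pairs


# Made-up annotation used only for illustration.
genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 420, 900, "+"),
    Gene("geneC", 950, 1300, "-"),
    Gene("geneD", 1290, 1800, "-"),
]
pairs = candidate_pairs(genes)
for left, right, distance in pairs:
    print(left, right, distance)
```

In a full pipeline, each such pair would presumably be combined with expression-derived features before classification; those steps are outside this sketch.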
Pages: 6