rSeqTU: A Machine-Learning Based R Package for Prediction of Bacteria Transcription Units

Cited by: 5
Authors
Niu, Sheng-Yong [1 ]
Liu, Binqiang [2 ]
Ma, Qin [3 ]
Chou, Wen-Chi [4 ]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[2] Shandong Univ, Sch Math, Jinan, Shandong, Peoples R China
[3] Ohio State Univ, Coll Med, Biomed Informat, Columbus, OH 43210 USA
[4] Broad Inst MIT & Harvard, Infect Dis & Microbiome Program, Cambridge, MA 02142 USA
Funding
National Science Foundation (USA);
Keywords
machine learning; bacteria; transcription unit; R package; transcriptome;
DOI
10.3389/fgene.2019.00374
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
A transcription unit (TU) consists of one or more adjacent genes on the same strand that are co-transcribed, mostly in prokaryotes. Accurate identification of TUs is a crucial first step in delineating transcriptional regulatory networks and elucidating the dynamic regulatory mechanisms encoded in prokaryotic genomes. Many genomic features, such as intergenic distance, and transcriptomic features, including continuous and stable RNA-seq read count signals, have been collected from large amounts of experimental data and integrated into classification techniques to computationally predict genome-wide TUs. Although some tools and web servers can predict TUs from bacterial RNA-seq data and genome sequences, there remains a need for an improved machine learning prediction approach and a more comprehensive pipeline that handles quality control (QC), TU prediction, and TU visualization. To let users efficiently identify TUs on their local computers or high-performance clusters, and to provide more accurate predictions, we developed an R package named rSeqTU. rSeqTU uses a random forest algorithm to select essential features describing TUs and then uses a support vector machine (SVM) to build TU prediction models. rSeqTU (available at https://s18692001.github.io/rSeqTU/) provides six computational functions: read quality control, read mapping, training set generation, random forest-based feature selection, TU prediction, and TU visualization.
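The TU definition in the abstract (adjacent genes, same strand, co-transcribed) can be illustrated with a minimal Python sketch. This is not rSeqTU's code or API: the `Gene` record, the `chain_transcription_units` function, and the gene names and coordinates are all hypothetical, and the list of per-pair calls simply stands in for whatever a trained classifier such as the SVM would output for each adjacent gene pair.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; not a type from rSeqTU."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def chain_transcription_units(
    genes: Sequence[Gene], same_tu: Sequence[bool]
) -> List[List[str]]:
    """Group genes (sorted by start) into candidate TUs.

    same_tu[i] is an assumed classifier call saying whether genes[i]
    and genes[i + 1] are co-transcribed. A gene joins the previous TU
    only if the call is positive AND both genes share a strand.
    """
    if len(same_tu) != max(len(genes) - 1, 0):
        raise ValueError("need exactly one call per adjacent gene pair")

    units: List[List[str]] = []
    for i, gene in enumerate(genes):
        joins_previous = (
            i > 0
            and genes[i - 1].strand == gene.strand
            and same_tu[i - 1]
        )
        if joins_previous:
            units[-1].append(gene.name)
        else:
            units.append([gene.name])
    return units


# Made-up example: four genes, three adjacent-pair calls.
genes = [
    Gene("geneA", 100, 900, "+"),
    Gene("geneB", 950, 1800, "+"),
    Gene("geneC", 2500, 3300, "+"),
    Gene("geneD", 3350, 4000, "-"),
]
pair_calls = [True, False, True]
units = chain_transcription_units(genes, pair_calls)
print(units)
```

In this toy input, the positive call between geneC and geneD is overridden because the two genes lie on opposite strands, which reflects the same-strand requirement in the definition.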
Pages: 6