Automated Prediction and Annotation of Small Open Reading Frames in Microbial Genomes

Cited by: 24
Authors
Durrant, Matthew G. [1, 2]
Bhatt, Ami S. [1, 2]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Med Hematol Blood & Marrow Transplantat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
RNA; ALIGNMENT; BACTERIAL; PROTEINS; HIDDEN; SUITE;
DOI
10.1016/j.chom.2020.11.002
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification codes
071005 ; 100705 ;
Abstract
Small open reading frames (smORFs) and their encoded microproteins play central roles in microbes. However, there is a vast unexplored space of smORFs within human-associated microbes. A recent bioinformatic analysis used evolutionary conservation signals to enhance prediction of small protein families. To facilitate the annotation of specific smORFs, we introduce SmORFinder. This tool combines profile hidden Markov models of each smORF family and deep learning models that better generalize to smORF families not seen in the training set, resulting in predictions enriched for Ribo-seq translation signals. Feature importance analysis reveals that the deep learning models learn to identify Shine-Dalgarno sequences, deprioritize the wobble position in each codon, and group codon synonyms found in the codon table. A core-genome analysis of 26 bacterial species identifies several core smORFs of unknown function. We pre-compute smORF annotations for thousands of RefSeq isolate genomes and Human Microbiome Project metagenomes and provide these data through a public web portal.
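To make the term "small open reading frame" concrete, the following is a minimal sketch of a naive six-frame ORF scan in Python. It is not SmORFinder's method: the tool described above relies on profile hidden Markov models and deep learning, neither of which is reproduced here. The function name `find_smorfs`, the 50-codon upper bound, the 5-codon lower bound, and the choice of start codons are all assumptions made for illustration.

```python
# Naive smORF candidate scan (illustrative only; not the SmORFinder algorithm).
# Assumptions: start codons ATG/GTG/TTG, and a hypothetical length window
# of 5 to 50 codons (stop codon excluded from the count).
STARTS = {"ATG", "GTG", "TTG"}
STOPS = {"TAA", "TAG", "TGA"}


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_smorfs(seq, min_codons=5, max_codons=50):
    """Return (strand, start, end, orf) tuples for short ORFs in all six frames.

    Coordinates are 0-based, end-exclusive, and relative to the scanned strand.
    For each stop codon, only the most upstream start codon is reported.
    """
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(s):
                if s[i:i + 3] not in STARTS:
                    i += 3
                    continue
                # Walk codon by codon until a stop codon or the sequence end.
                j = i + 3
                while j + 3 <= len(s) and s[j:j + 3] not in STOPS:
                    j += 3
                if j + 3 > len(s):
                    break  # no in-frame stop downstream; nothing more in this frame
                n_codons = (j - i) // 3  # start codon included, stop excluded
                if min_codons <= n_codons <= max_codons:
                    hits.append((strand, i, j + 3, s[i:j + 3]))
                i = j + 3  # resume scanning after the stop codon
    return hits


if __name__ == "__main__":
    toy = "CC" + "ATG" + "AAA" * 5 + "TAA" + "GG"
    for hit in find_smorfs(toy):
        print(hit)
```

A scan like this only enumerates candidates; most short ORFs found this way are likely spurious, which is why additional signals such as conservation are needed to prioritize them.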
Pages: 121 / +
Number of pages: 15