Automated Prediction and Annotation of Small Open Reading Frames in Microbial Genomes
Cited by: 24
Authors:
Durrant, Matthew G. [1,2]
Bhatt, Ami S. [1,2]
Affiliations:
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Med Hematol Blood & Marrow Transplantat, Stanford, CA 94305 USA
Small open reading frames (smORFs) and their encoded microproteins play central roles in microbes. However, there is a vast unexplored space of smORFs within human-associated microbes. A recent bioinformatic analysis used evolutionary conservation signals to enhance prediction of small protein families. To facilitate the annotation of specific smORFs, we introduce SmORFinder. This tool combines profile hidden Markov models of each smORF family and deep learning models that better generalize to smORF families not seen in the training set, resulting in predictions enriched for Ribo-seq translation signals. Feature importance analysis reveals that the deep learning models learn to identify Shine-Dalgarno sequences, deprioritize the wobble position in each codon, and group codon synonyms found in the codon table. A core-genome analysis of 26 bacterial species identifies several core smORFs of unknown function. We pre-compute smORF annotations for thousands of RefSeq isolate genomes and Human Microbiome Project metagenomes and provide these data through a public web portal.
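As a rough illustration of what a smORF candidate looks like at the sequence level, the sketch below scans the forward strand of a DNA string for short start-to-stop reading frames. It is not SmORFinder's method: the abstract describes profile hidden Markov models and deep learning models, neither of which is reproduced here. The function name `find_smorf_candidates`, the 50-amino-acid upper bound, the 5-amino-acid lower bound, and the set of start codons are all assumptions chosen for the example.

```python
# Illustrative only: a naive forward-strand ORF scan, NOT SmORFinder's pipeline.
# All names and thresholds below are assumptions made for this sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons
MAX_AA = 50  # assumed upper size limit for a "small" ORF


def find_smorf_candidates(seq, max_aa=MAX_AA, min_aa=5):
    """Return (start, end, frame) tuples for short ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the span includes the stop codon.
    Within each frame, the first start codon after the previous stop is used.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop, start included
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return hits
```

For example, `find_smorf_candidates("CC" + "ATG" + "GCT" * 6 + "TAA" + "GG")` reports a single candidate in frame 2. A real annotation workflow would also need to scan the reverse strand and, as the abstract notes, weigh evidence such as family-level conservation and ribosome-binding signals, which a length filter alone cannot provide.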