Sequence-based heuristics for faster annotation of non-coding RNA families

Cited by: 60
Authors
Weinberg, Z [1]
Ruzzo, WL
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
DOI
10.1093/bioinformatics/bti743
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be.
Results: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.
Pages: 35-39
Page count: 5