A framework for improving microRNA prediction in non-human genomes

Cited by: 34
Authors
Peace, Robert J. [1]
Biggar, Kyle K. [2, 3, 4]
Storey, Kenneth B. [2, 3]
Green, James R. [1]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[2] Carleton Univ, Inst Biochem, Ottawa, ON K1S 5B6, Canada
[3] Carleton Univ, Dept Biol, Ottawa, ON K1S 5B6, Canada
[4] Univ Western Ontario, Dept Biochem, London, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
SUPPORT VECTOR MACHINE; EFFECTIVE CLASSIFICATION; RANDOM FOREST; PRECURSORS; IDENTIFICATION; EFFICIENT; SELECTION; SEQUENCE; FEATURES; REGIONS;
DOI
10.1093/nar/gkv698
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
The prediction of novel pre-microRNA (miRNA) from genomic sequence has received considerable attention recently. However, the majority of studies have focused on the human genome. Previous studies have demonstrated that sensitivity (correctly detecting true miRNA) is sustained when human-trained methods are applied to other species; however, they have failed to report the dramatic drop in specificity (the ability to correctly reject non-miRNA sequences) in non-human genomes. Considering that the ratio of true miRNA sequences to pseudo-miRNA sequences is on the order of 1:1000, such low specificity prevents the application of most existing tools to non-human genomes, as the number of false positives overwhelms the true predictions. Here we introduce a framework (SMIRP) for creating species-specific miRNA prediction systems, leveraging sequence conservation and phylogenetic distance information. Substantial improvements in specificity and precision are obtained for four non-human test species when our framework is applied to three different prediction systems representing two types of classifiers (support vector machine and Random Forest), based on three different feature sets, with both human-specific and taxon-wide training data. The SMIRP framework is potentially applicable to all miRNA prediction systems, and we expect substantial improvement in precision and specificity, while sustaining sensitivity, independent of the machine learning technique chosen.
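The abstract's point about false positives can be illustrated with a little arithmetic. The sketch below computes precision from sensitivity, specificity, and the class ratio. The 1:1000 ratio comes from the abstract; the sensitivity and specificity values, and the function name `precision_from_rates`, are illustrative assumptions and are not results reported by the paper.

```python
def precision_from_rates(sensitivity, specificity, positives, negatives):
    """Expected precision given per-class rates and class counts.

    precision = TP / (TP + FP), with
      TP = sensitivity * positives
      FP = (1 - specificity) * negatives
    """
    true_pos = sensitivity * positives
    false_pos = (1.0 - specificity) * negatives
    if true_pos + false_pos == 0:
        return 0.0  # no positive predictions at all
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Class ratio of about 1 true miRNA to 1000 pseudo-miRNA (from the abstract).
    # Sensitivity is held fixed; the specificity values are hypothetical.
    for spec in (0.90, 0.99, 0.999):
        p = precision_from_rates(0.90, spec, positives=1, negatives=1000)
        print(f"specificity={spec:.3f} -> precision={p:.4f}")
```

Under these assumed rates, even a specificity that looks high in isolation leaves most positive predictions false at a 1:1000 ratio, which is why the abstract treats specificity, rather than sensitivity, as the limiting factor.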
Pages: 12