A framework for improving microRNA prediction in non-human genomes

Cited by: 34
Authors
Peace, Robert J. [1]
Biggar, Kyle K. [2, 3, 4]
Storey, Kenneth B. [2, 3]
Green, James R. [1]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[2] Carleton Univ, Inst Biochem, Ottawa, ON K1S 5B6, Canada
[3] Carleton Univ, Dept Biol, Ottawa, ON K1S 5B6, Canada
[4] Univ Western Ontario, Dept Biochem, London, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
SUPPORT VECTOR MACHINE; EFFECTIVE CLASSIFICATION; RANDOM FOREST; PRECURSORS; IDENTIFICATION; EFFICIENT; SELECTION; SEQUENCE; FEATURES; REGIONS;
DOI
10.1093/nar/gkv698
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
The prediction of novel pre-microRNA (miRNA) from genomic sequence has received considerable attention recently. However, the majority of studies have focused on the human genome. Previous studies have demonstrated that sensitivity (correctly detecting true miRNA) is sustained when human-trained methods are applied to other species; however, they have failed to report the dramatic drop in specificity (the ability to correctly reject non-miRNA sequences) in non-human genomes. Considering that the ratio of true miRNA sequences to pseudo-miRNA sequences is on the order of 1:1000, such low specificity prevents the application of most existing tools to non-human genomes, as the number of false positives overwhelms the true predictions. Here we introduce a framework (SMIRP) for creating species-specific miRNA prediction systems, leveraging sequence conservation and phylogenetic distance information. Substantial improvements in specificity and precision are obtained for four non-human test species when our framework is applied to three different prediction systems representing two types of classifiers (support vector machine and Random Forest), based on three different feature sets, with both human-specific and taxon-wide training data. The SMIRP framework is potentially applicable to all miRNA prediction systems, and we expect substantial improvement in precision and specificity, while sustaining sensitivity, independent of the machine learning technique chosen.
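The abstract's point about class imbalance can be made concrete with a short calculation. The sketch below is not taken from the paper: the function name `precision` and the sensitivity and specificity values are assumptions chosen for illustration only; the 1:1000 ratio of true to pseudo-miRNA sequences is the one quoted in the abstract.

```python
def precision(sensitivity: float, specificity: float, neg_per_pos: float) -> float:
    """Expected precision when there are `neg_per_pos` negatives per positive.

    Hypothetical helper for illustration; not part of SMIRP.
    """
    true_pos = sensitivity                         # expected hits per true pre-miRNA
    false_pos = (1.0 - specificity) * neg_per_pos  # expected false alarms per true pre-miRNA
    return true_pos / (true_pos + false_pos)


# Illustrative specificity values only (assumed, not results from the paper).
for spec in (0.90, 0.99, 0.999):
    p = precision(sensitivity=0.90, specificity=spec, neg_per_pos=1000)
    print(f"specificity={spec:.3f} -> precision={p:.3f}")
```

Under these assumed values, even a classifier that rejects 99% of pseudo-miRNA sequences returns mostly false positives at a 1:1000 ratio, which is why a drop in specificity on non-human genomes matters more than sustained sensitivity.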
Pages: 12