Genome-wide discovery of miRNAs using ensembles of machine learning algorithms and logistic regression

Cited by: 5
Authors
Ulfenborg, Benjamin [1]
Klinga-Levan, Karin [1]
Olsson, Bjorn [1]
Affiliations
[1] Univ Skovde, Sch Biosci, Syst Biol Res Ctr, Skovde, Sweden
Keywords
miRNA prediction; miRNA discovery; RNA structure prediction; GenoScan; ensemble classifier; regression model; machine learning; RNA SECONDARY STRUCTURE; WEB SERVER; COMPUTATIONAL IDENTIFICATION; MICRORNA; PREDICTION; CLASSIFICATION; PRECURSORS; SOFTWARE; TOOL; SEQUENCES;
DOI
10.1504/IJDMB.2015.072755
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
In silico prediction of novel miRNAs from genomic sequences remains a challenging problem. This study presents a genome-wide miRNA discovery software package called GenoScan and evaluates two hairpin classification methods. These methods, one ensemble-based and one using logistic regression, were benchmarked along with 15 published methods. In addition, the sequence-folding step is addressed by investigating the impact of secondary structure prediction methods and the choice of input sequence length on prediction performance. Both the accuracy of the secondary structure predictions and the accuracy of the miRNA predictions are evaluated. In the benchmark of hairpin classification methods, the regression model achieved the highest classification accuracy. Of the structure prediction methods evaluated, ContextFold achieved the highest agreement between predicted and experimentally determined structures. However, both the choice of secondary structure prediction method and the input sequence length had limited impact on hairpin classification performance.
Pages: 338-359
Number of pages: 22