RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers

Cited by: 56
Authors
Bindewald, E
Shapiro, BA
Affiliations
[1] NCI, Ctr Canc Res, Nanobiol Program, Frederick, MD 21702 USA
[2] SAIC Frederick Inc, Basic Res Program, Frederick, MD USA
Keywords
RNA; secondary structure; mutual information; machine learning; alignment
DOI
10.1261/rna.2164906
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We present a machine learning method (a hierarchical network of k-nearest neighbor classifiers) that uses an RNA sequence alignment to predict a consensus RNA secondary structure. The input to the network is the mutual information, the fraction of complementary nucleotides, and a novel consensus RNAfold secondary structure prediction of a pair of alignment columns and its nearest neighbors. Given this input, the network computes a prediction as to whether a particular pair of alignment columns corresponds to a base pair. By using a comprehensive test set of 49 RFAM alignments, the program KNetFold achieves an average Matthews correlation coefficient of 0.81. This is a significant improvement compared with the secondary structure prediction methods PFOLD and RNAalifold. By using the example of archaeal RNase P, we show that the program can also predict pseudoknot interactions.
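Two of the inputs named in the abstract (the mutual information and the fraction of complementary nucleotides for a pair of alignment columns) and the evaluation measure (the Matthews correlation coefficient) can be illustrated with a minimal Python sketch. This is not the KNetFold implementation. The function names, the toy alignment, the treatment of every character as an ordinary symbol, and the choice to count G-U wobble pairs as complementary are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumption: Watson-Crick pairs plus G-U wobble count as complementary.
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, index):
    """Return one alignment column as a list of characters."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    total = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        total += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return total


def complementary_fraction(col_i, col_j):
    """Fraction of sequences whose two characters can form a base pair."""
    pairs = list(zip(col_i, col_j))
    return sum(pair in COMPLEMENTARY for pair in pairs) / len(pairs)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Hypothetical toy alignment: columns 0 and 3 covary, columns 1 and 2 are constant.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
first, last = column(toy_alignment, 0), column(toy_alignment, 3)
print(mutual_information(first, last), complementary_fraction(first, last))
```

In this toy case the covarying outer columns give high mutual information and a complementary fraction of 1, while the constant inner columns give zero for both, which is the kind of signal a classifier over column pairs could use.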
Pages: 342 - 352
Page count: 11
Related papers
45 in total
  • [1] Phylogenetically enhanced statistical tools for RNA structure prediction
    Akmaev, VR
    Kelley, ST
    Stormo, GD
    [J]. BIOINFORMATICS, 2000, 16 (06) : 501 - 512
  • [2] ARYA S, 1993, P DCC 93 DAT COMPR C, P381
  • [3] Assessing the accuracy of prediction algorithms for classification: an overview
    Baldi, P
    Brunak, S
    Chauvin, Y
    Andersen, CAF
    Nielsen, H
    [J]. BIOINFORMATICS, 2000, 16 (05) : 412 - 424
  • [4] Basharin G. P., 1959, THEOR PROBAB APPL, V4, P333
  • [5] The Ribonuclease P Database
    Brown, JW
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (01) : 314 - 314
  • [6] RNA secondary structure and compensatory evolution
    Chen, Y
    Carlini, DB
    Baines, JF
    Parsch, J
    Braverman, JM
    Tanda, S
    Stephan, W
    [J]. GENES & GENETIC SYSTEMS, 1999, 74 (06) : 271 - 286
  • [7] Prediction of contact maps with neural networks and correlated mutations
    Fariselli, P
    Olmea, O
    Valencia, A
    Casadio, R
    [J]. PROTEIN ENGINEERING, 2001, 14 (11) : 835 - 843
  • [8] Freund Y, 1996, ICML
  • [9] A comprehensive comparison of comparative RNA structure prediction approaches
    Gardner, PP
    Giegerich, R
    [J]. BMC BIOINFORMATICS, 2004, 5 (1)
  • [10] Discovering common stem-loop motifs in unaligned RNA sequences
    Gorodkin, J
    Stricklin, SL
    Stormo, GD
    [J]. NUCLEIC ACIDS RESEARCH, 2001, 29 (10) : 2135 - 2144