Predicting Large RNA-Like Topologies by a Knowledge-Based Clustering Approach

Cited by: 15
Authors
Baba, Naoto [1, 2, 3]
Elmetwaly, Shereef [1, 2]
Kim, Namhee [1, 2]
Schlick, Tamar [1, 2, 4]
Affiliations
[1] NYU, Dept Chem, 251 Mercer St, New York, NY 10012 USA
[2] NYU, Courant Inst Math Sci, 251 Mercer St, New York, NY 10012 USA
[3] Nagoya Univ, Dept Chem, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan
[4] NYU Shanghai, NYU ECNU Ctr Computat Chem, 3663 Zhongshan Rd North, Shanghai 200062, Peoples R China
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
RNA secondary structure; RNA atlas; RNA motifs; RNA design; Prediction of RNA-like motifs; STRUCTURAL GENOMICS; DATABASE; GRAPHS; POOLS
DOI
10.1016/j.jmb.2015.10.009
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
An analysis and expansion of RAG (RNA-As-Graphs), our resource for classifying, predicting, and designing RNA structures, is presented, with the goal of understanding the features of RNA-like and non-RNA-like motifs and exploiting this information for RNA design. RAG was first reported in 2004 for cataloging RNA secondary structure motifs using graph representations. In 2011, the RAG resource was updated to reflect the increased availability of RNA structures and was improved with utilities for analyzing RNA structures, including substructuring and search tools. We also used clustering approaches to classify RNA structures, represented as graphs of up to 10 vertices (~200 nucleotides), into three classes: existing, RNA-like, and non-RNA-like. Here, we focus on the tree graphs and evaluate the RNAs newly found since 2011, which also support our refined predictions of RNA-like motifs. We expand the RAG resource to large tree graphs of up to 13 vertices (~260 nucleotides), thereby cataloging more than 10 times as many secondary structures. We apply clustering algorithms based on features of RNA secondary structures translated from known tertiary structures to suggest which hypothetical large RNA motifs can be considered "RNA-like". The results from the PAM (Partitioning Around Medoids) approach, in particular, show good accuracy, with small error for the largest cases. The RAG update presented here, up to 13 vertices, offers a useful graph-based tool for exploring RNA motifs and suggesting large RNA motifs for design. © 2015 Elsevier Ltd. All rights reserved.
Pages: 811-821
Page count: 11
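
The abstract states that extending the tree graphs from 10 to 13 vertices catalogs more than 10 times as many secondary structures. The sketch below checks that figure under one assumption not stated in this record: that the tree-graph catalog corresponds to the non-isomorphic unlabeled trees on each vertex count. The function names are hypothetical and this is not the authors' enumeration code. It counts rooted trees with the standard recurrence and converts them to unrooted (free) trees with Otter's formula.

```python
def rooted_tree_counts(n_max):
    """Return r where r[n] is the number of unlabeled rooted trees on n vertices.

    Uses the classical recurrence
        r(n+1) = (1/n) * sum_{k=1..n} s(k) * r(n-k+1),
    with s(k) = sum over divisors d of k of d * r(d). Index 0 is unused.
    """
    r = [0] * (n_max + 1)
    if n_max >= 1:
        r[1] = 1
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            s_k = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += s_k * r[n - k + 1]
        r[n + 1] = total // n  # the sum is always divisible by n
    return r


def free_tree_counts(n_max):
    """Return t where t[n] is the number of unlabeled (unrooted) trees on n vertices.

    Otter's formula: t(n) = r(n) - (1/2) * (sum_{i=1..n-1} r(i) r(n-i) - [n even] r(n/2)).
    """
    r = rooted_tree_counts(n_max)
    t = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        pairs = sum(r[i] * r[n - i] for i in range(1, n))
        if n % 2 == 0:
            pairs -= r[n // 2]
        t[n] = r[n] - pairs // 2
    return t


if __name__ == "__main__":
    t = free_tree_counts(13)
    up_to_10 = sum(t[2:11])  # trees with 2..10 vertices
    up_to_13 = sum(t[2:14])  # trees with 2..13 vertices
    print(up_to_10, up_to_13, up_to_13 / up_to_10)
```

If the assumption holds, the totals are 200 trees for 2 to 10 vertices and 2287 for 2 to 13 vertices, a ratio of roughly 11, which would be consistent with the "more than 10 times" figure in the abstract.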