DotAligner: identification and clustering of RNA structure motifs

Cited by: 12
Authors
Smith, Martin A. [1 ,2 ]
Seemann, Stefan E. [3 ,4 ]
Quek, Xiu Cheng [1 ,2 ]
Mattick, John S. [1 ,2 ]
Affiliations
[1] Garvan Inst Med Res, RNA Biol & Plast Grp, 384 Victoria St, Sydney, NSW 2010, Australia
[2] UNSW Australia, St Vincents Clin Sch, Fac Med, Sydney, NSW 2010, Australia
[3] Univ Copenhagen, Ctr Noncoding RNA Technol & Hlth RTH, Groennegaardsvej 3, DK-1870 Frederiksberg, Denmark
[4] Univ Copenhagen, Fac Hlth & Med Sci, Dept Vet & Anim Sci, DK-1870 Frederiksberg, Denmark
Keywords
Functions of RNA structures; RNA structure clustering; Machine learning; RNA-protein interactions; Functional genome annotation; Regulation by non-coding RNAs; LONG NONCODING RNAS; PARTITION-FUNCTION; ALIGNMENT; TRANSCRIPTOMES; PREDICTION; FAMILIES; SCAFFOLD
DOI
10.1186/s13059-017-1371-3
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
The diversity of processed transcripts in eukaryotic genomes poses a challenge for the classification of their biological functions. Sparse sequence conservation in non-coding sequences and the unreliable nature of RNA structure predictions further exacerbate this conundrum. Here, we describe a computational method, DotAligner, for the unsupervised discovery and classification of homologous RNA structure motifs from a set of sequences of interest. Our approach outperforms comparable algorithms at clustering known RNA structure families, both in speed and accuracy. It identifies clusters of known and novel structure motifs from ENCODE immunoprecipitation data for 44 RNA-binding proteins.
Pages: 12