Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection

Cited by: 58
Authors
Chen, Lei [1, 2]
Zhang, Yu-Hang [3]
Huang, Guohua [4]
Pan, Xiaoyong [5]
Wang, ShaoPeng [1]
Huang, Tao [3]
Cai, Yu-Dong [1]
Affiliations
[1] Shanghai Univ, Coll Life Sci, Shanghai 200444, Peoples R China
[2] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[3] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Shanghai 200031, Peoples R China
[4] Shaoyang Univ, Dept Math, Shaoyang 422000, Hunan, Peoples R China
[5] Erasmus MC, Dept Med Informat, Rotterdam, Netherlands
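Funding
Natural Science Foundation of Hunan Province; Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords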
cirRNAs; lncRNAs; Minimum redundancy maximum relevance; Hierarchical extreme learning machine algorithm; LONG NONCODING RNA; SUPPORT VECTOR MACHINE; TRANSPOSABLE ELEMENTS; CIRCULAR RNA; GENE-EXPRESSION; PROTEIN INTERACTIONS; MOLECULAR FRAGMENTS; MINIMUM REDUNDANCY; MAXIMUM RELEVANCE; METABOLIC PATHWAY;
DOI
10.1007/s00438-017-1372-7
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
As non-coding RNAs, circular RNAs (cirRNAs) and long non-coding RNAs (lncRNAs) have attracted increasing attention. They have been confirmed to participate in many biological processes, including transcriptional regulation, regulation of protein-coding genes, and binding to RNA-associated proteins. The differences between these two types of non-coding RNAs have not yet been fully uncovered, and it remains difficult to distinguish cirRNAs from other lncRNAs using simple techniques. In this study, we investigated these two types of non-coding RNAs using several computational methods. The purpose was to extract important factors that distinguish cirRNAs from other lncRNAs and to build an effective classification model. First, we collected cirRNAs, lncRNAs and their representations from a previous study, in which each cirRNA or lncRNA was represented by 188 features derived from its graph representation, sequence and conservation properties. Second, these features were analyzed by the minimum redundancy maximum relevance (mRMR) method. The resulting mRMR feature list, the incremental feature selection method and the hierarchical extreme learning machine algorithm were then used to build an optimal classification model with a sensitivity of 0.703, a specificity of 0.850, an accuracy of 0.789 and a Matthews correlation coefficient of 0.561. Finally, we analyzed the 16 most important features. Among them, features describing the sequence and structure of the RNA molecule ranked highest, implying that they are potential indicators of differences between cirRNAs and other lncRNAs. Other features, relating to evolutionary conservation and sequence consecution, were also important.
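The four performance measures quoted in the abstract are all computed from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC.

    tp/fn: positive samples (e.g. cirRNAs) predicted correctly/incorrectly.
    tn/fp: negative samples (e.g. other lncRNAs) predicted correctly/incorrectly.
    Assumes each class contains at least one sample.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts chosen only for illustration (not the paper's data).
example = binary_metrics(tp=70, fn=30, tn=85, fp=15)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance.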
Pages: 137-149
Page count: 13