Predicting RNA 5-Methylcytosine Sites by Using Essential Sequence Features and Distributions

Cited by: 41
Authors
Chen, Lei [1, 2]
Li, ZhanDong [3]
Zhang, ShiQi [4]
Zhang, Yu-Hang [5]
Huang, Tao [6, 7]
Cai, Yu-Dong [1]
Affiliations
[1] Shanghai Univ, Sch Life Sci, Shanghai 200444, Peoples R China
[2] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[3] Jilin Engn Normal Univ, Coll Food Engn, Changchun, Peoples R China
[4] Univ Copenhagen, Dept Biostat, DK-2099 Copenhagen, Denmark
[5] Harvard Med Sch, Brigham & Womens Hosp, Channing Div Network Med, Boston, MA 02115 USA
[6] Chinese Acad Sci, Univ Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Biomed Big Data Ctr,CAS Key Lab Computat Biol, Shanghai 200031, Peoples R China
[7] Chinese Acad Sci, Univ Chinese Acad Sci, Shanghai Inst Nutr & Hlth, CAS Key Lab Tissue Microenvironm & Tumor, Shanghai 200031, Peoples R China
Funding
National Key R&D Program of China;
Keywords
METHYLATION; M(6)A; IDENTIFICATION; LANDSCAPE;
DOI
10.1155/2022/4035462
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification code
071005; 0836; 090102; 100705;
Abstract
Methylation, mediated by multiple enzymes, is one of the most common and important modifications in biological systems. Recent studies have identified methylation in a wide range of RNA molecules. RNA methylation takes several forms, including 5-methylcytosine (m(5)C). However, the functions of individual methylation sites remain to be elucidated. Testing all methylation sites relies heavily on high-throughput sequencing, which is expensive and labor-intensive, so computational prediction could serve as a substitute. In this study, multiple machine learning models were used to predict candidate RNA m(5)C sites from human and mouse mRNA sequences. Each site was represented by several features derived from the k-mers of an RNA subsequence centered on that site. The max-relevance and min-redundancy (mRMR) feature selection method was used to analyze and rank these features. The resulting feature list was fed into the incremental feature selection (IFS) method, combined with four classification algorithms, to build efficient models. The sites related to the features used in the models were also investigated.
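The pipeline outlined in the abstract (k-mer features for a window centered on a candidate site, then incremental feature selection over a ranked feature list) can be illustrated with a minimal sketch. This is not the authors' implementation. The window length, the choice of k, the normalized-frequency encoding, the leave-one-out 1-nearest-neighbor scorer, and all function names are assumptions made for illustration. The mRMR ranking itself is not implemented here: the `ranked` argument stands in for whatever order mRMR would produce.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"


def kmer_features(seq, ks=(1, 2)):
    """Normalized k-mer frequencies of an RNA window, in a fixed feature order."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for k in ks:
        n_windows = max(len(seq) - k + 1, 1)
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        for kmer in product(ALPHABET, repeat=k):
            features.append(counts["".join(kmer)] / n_windows)
    return features


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier (assumed scorer)."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, ranked, step=1):
    """Score the top-n ranked features for n = step, 2*step, ...; keep the best n.

    `ranked` is a list of feature indices, best first (for example, mRMR output).
    Ties are resolved in favor of the smaller feature set.
    """
    best_n, best_score = 0, -1.0
    for n in range(step, len(ranked) + 1, step):
        subset = [[row[f] for f in ranked[:n]] for row in X]
        score = loo_1nn_accuracy(subset, y)
        if score > best_score:
            best_n, best_score = n, score
    return ranked[:best_n], best_score


# Hypothetical 9-nt windows, each with a cytosine at the center position.
windows = ["GGACCUGGA", "AUGCCAGUA", "UUAACAAUU", "AAUUCUUAA"]
labels = [1, 1, 0, 0]  # made-up labels, for illustration only
X = [kmer_features(w) for w in windows]
ranked = list(range(len(X[0])))  # placeholder order; mRMR would supply this
selected, score = incremental_feature_selection(X, labels, ranked, step=4)
```

In practice the scorer would be replaced by cross-validation of each of the four classifiers mentioned in the abstract, and the step size trades search cost against resolution in the number of features retained.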
Pages: 11