Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications

Citations: 88
Authors
Song, Zitao [1]
Huang, Daiyun [2, 3]
Song, Bowen [1, 4]
Chen, Kunqi [5]
Song, Yiyou [2]
Liu, Gang [1]
Su, Jionglong [6]
de Magalhaes, Joao Pedro [7]
Rigden, Daniel J. [4]
Meng, Jia [2, 4, 8]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Dept Math Sci, Suzhou, Peoples R China
[2] Xian Jiaotong Liverpool Univ, Dept Biol Sci, Suzhou, Peoples R China
[3] Univ Liverpool, Dept Comp Sci, Liverpool, Merseyside, England
[4] Univ Liverpool, Inst Syst Mol & Integrat Biol, Liverpool, Merseyside, England
[5] Fujian Med Univ, Sch Basic Med Sci, Lab Minist Educ Gastrointestinal Canc, Fuzhou, Peoples R China
[6] Xian Jiaotong Liverpool Univ, XJTLU Entrepreneur Coll Taicang, Sch AI & Adv Comp, Suzhou, Peoples R China
[7] Univ Liverpool, Inst Ageing & Chron Dis, Liverpool, Merseyside, England
[8] Xian Jiaotong Liverpool Univ, AI Univ Res Ctr, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
WEB SERVER; SITES; PSEUDOURIDINE; NUCLEOTIDE; DATABASE; TISSUES;
DOI
10.1038/s41467-021-24313-3
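Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract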
Recent studies suggest that epi-transcriptome regulation via post-transcriptional RNA modifications is vital for all RNA types. Precise identification of RNA modification sites is essential for understanding the functions and regulatory mechanisms of RNAs. Here, we present MultiRM, a method for the integrated prediction and interpretation of post-transcriptional RNA modifications from RNA sequences. Built upon an attention-based multi-label deep learning framework, MultiRM not only simultaneously predicts the putative sites of twelve widely occurring transcriptome modifications (m6A, m1A, m5C, m5U, m6Am, m7G, Ψ, I, Am, Cm, Gm, and Um), but also returns the key sequence contents that contribute most to the positive predictions. Importantly, our model revealed a strong association among different types of RNA modifications from the perspective of their associated sequence contexts. Our work provides a solution for detecting multiple RNA modifications, enabling an integrated analysis of these RNA modifications, and gaining a better understanding of sequence-based RNA modification mechanisms.
RNA modifications appear to play a role in determining RNA structure and function. Here, the authors develop a deep learning model that predicts the location of 12 RNA modifications using primary sequence, and show that several modifications are associated, which suggests dependencies between them.
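The abstract describes two ideas that a short example can make concrete: multi-label prediction (one independent probability per modification type) and attention weights that indicate which sequence positions drive each prediction. The following is a minimal sketch of those two ideas only, not the MultiRM architecture itself. The function names (`one_hot`, `init_params`, `predict`), the one-hot encoding, and the random, untrained parameters are all assumptions made for illustration.

```python
import math
import random

# The twelve modification labels named in the abstract ("Psi" stands for pseudouridine).
LABELS = ["m6A", "m1A", "m5C", "m5U", "m6Am", "m7G",
          "Psi", "I", "Am", "Cm", "Gm", "Um"]
BASES = "ACGU"


def one_hot(seq):
    """Encode an RNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(seed=0):
    """Random, untrained parameters (hypothetical; for illustration only)."""
    rng = random.Random(seed)

    def vec():
        return [rng.uniform(-1, 1) for _ in BASES]

    return {lab: {"query": vec(), "out": vec(), "bias": rng.uniform(-1, 1)}
            for lab in LABELS}


def predict(seq, params):
    """Return {label: (probability, attention weights over positions)}."""
    x = one_hot(seq)
    result = {}
    for lab, p in params.items():
        # Label-specific attention: score each position, normalise with softmax.
        scores = [sum(q * v for q, v in zip(p["query"], pos)) for pos in x]
        attn = softmax(scores)
        # Attention-weighted summary of the sequence for this label.
        context = [sum(a * pos[k] for a, pos in zip(attn, x))
                   for k in range(len(BASES))]
        # Independent sigmoid per label: the labels are not mutually exclusive.
        logit = sum(w * c for w, c in zip(p["out"], context)) + p["bias"]
        result[lab] = (sigmoid(logit), attn)
    return result


preds = predict("GGACUAGCU", init_params())
for label, (prob, attn) in preds.items():
    top = max(range(len(attn)), key=attn.__getitem__)
    print(f"{label}: p={prob:.2f}, most-attended position={top}")
```

Because the parameters are random, the printed probabilities carry no biological meaning. The point is the shape of the output: twelve independent probabilities for one input sequence, each paired with a per-position weight vector that sums to one and can be inspected for interpretation.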
Pages: 11