Intelligent and robust computational prediction model for DNA N4-methylcytosine sites via natural language processing

Cited by: 6
Authors
Tahir, Muhammad [1]
Tayara, Hilal [2]
Hayat, Maqsood [1]
Chong, Kil To [3, 4]
Affiliations
[1] Abdul Wali Khan Univ, Dept Comp Sci, Mardan 23200, KP, Pakistan
[2] Jeonbuk Natl Univ, Sch Int Engn & Sci, Jeonju 54896, South Korea
[3] Jeonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[4] Jeonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
Funding
National Research Foundation of Singapore;
Keywords
DNA; Natural language processing; Convolution neural network; word2vec; Methylcytosine; N4-METHYLCYTOSINE; N6-METHYLADENINE; IDENTIFICATION; METHYLATION;
DOI
10.1016/j.chemolab.2021.104391
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
DNA N4-methylcytosine (4mC) is an essential epigenetic modification that plays a crucial role in restriction-modification systems. 4mC is involved in many essential cellular processes, notably correcting DNA replication and controlling DNA replication errors in prokaryotic organisms. To understand its biological mechanisms, predicting 4mC modification sites is indispensable. Although computational approaches have targeted this problem, the desired outcomes have not been obtained. Thus, the development of an intelligent computational prediction system that accurately identifies 4mC modification sites is imperative. An efficient, high-throughput, discriminative intelligent computational system called "iDNA-4mC-DL" is introduced, using the natural language processing method "word2vec" along with a convolutional neural network. The results confirm that the proposed iDNA-4mC-DL system outperforms current tools on six publicly available benchmark and independent datasets. The proposed model is therefore expected to be a useful, practical tool for basic research and academia.
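The abstract does not say how DNA sequences are turned into "words" for word2vec. A common convention when applying word2vec to DNA, assumed here and not confirmed by this record, is to treat overlapping k-mers as the words of a sentence. The sketch below illustrates only that tokenization step; the function name `kmer_tokens`, the choice k = 3, and the example sequence are hypothetical.

```python
from typing import List


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers ("words").

    Assumption: overlapping k-mers with stride 1; the paper's actual
    tokenization scheme is not stated in the abstract.
    """
    seq = seq.upper()
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical short sequence, used only to show the shape of the output.
sentence = kmer_tokens("ACGTCCGATAGC", k=3)
print(sentence)
```

Each resulting list of k-mers could then be passed as one sentence to a word2vec trainer, and the learned k-mer vectors stacked in order as input to a convolutional network.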
Pages: 6