iDNA6mA-PseKNC: Identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC

Times cited: 249
Authors
Feng, Pengmian [1]
Yang, Hui [2]
Ding, Hui [2]
Lin, Hao [2,5]
Chen, Wei [3,4,5]
Chou, Kuo-Chen [2,5]
Affiliations
[1] North China Univ Sci & Technol, Hebei Prov Key Lab Occupat Hlth & Safety Coal Ind, Sch Publ Hlth, Tangshan 063000, Peoples R China
[2] Univ Elect Sci & Technol China, Key Lab Neuroinformat, Sch Life Sci & Technol, Minist Educ,Ctr Informat Biol, Chengdu 610054, Sichuan, Peoples R China
[3] North China Univ Sci & Technol, Dept Phys, Sch Sci, Tangshan 063000, Peoples R China
[4] North China Univ Sci & Technol, Ctr Genom & Computat Biol, Tangshan 063000, Peoples R China
[5] Gordon Life Sci Inst, Boston, MA 02478 USA
Funding
China Postdoctoral Science Foundation;
Keywords
PTMs; N-6-methyladenine; Nucleotide physicochemical properties; General PseKNC; Lingering density; Intuitive metrics; AMINO-ACID-COMPOSITION; SEQUENCE-BASED PREDICTOR; MEMBRANE-PROTEIN TYPES; LYSINE SUCCINYLATION SITES; LABEL LEARNING CLASSIFIER; SUPPORT VECTOR MACHINES; SUBCELLULAR-LOCALIZATION; K-TUPLE; RECOMBINATION SPOTS; STRUCTURAL CLASS;
DOI
10.1016/j.ygeno.2018.01.005
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
N-6-methyladenine (6mA) is one kind of post-replication modification (PTM or PTRM) occurring in a wide range of DNA sequences. Accurate identification of its sites will be very helpful for revealing the biological functions of 6mA, but it is time-consuming and expensive to determine them by experiments alone. Unfortunately, so far, no bioinformatics tool is available to do so. To fill in such an empty area, we have proposed a novel predictor called iDNA6mA-PseKNC that is established by incorporating nucleotide physicochemical properties into Pseudo K-tuple Nucleotide Composition (PseKNC). It has been observed via rigorous cross-validations that the predictor's sensitivity (Sn), specificity (Sp), accuracy (Acc), and stability (MCC) are 93%, 100%, 96%, and 0.93, respectively. For the convenience of most experimental scientists, a user-friendly web server for iDNA6mA-PseKNC has been established at http://lin-group.cn/server/iDNA6mA-PseKNC, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.
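The four metrics quoted in the abstract can all be derived from a binary confusion matrix. The sketch below uses the standard confusion-matrix definitions of Sn, Sp, Acc, and MCC; the function name `binary_metrics` and the counts passed to it are hypothetical illustrations, not the paper's benchmark data or the authors' code.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Sn, Sp, Acc and MCC from confusion-matrix counts.

    tp/fn: true sites predicted as sites / as non-sites
    tn/fp: non-sites predicted as non-sites / as sites
    """
    sn = tp / (tp + fn)                   # sensitivity (recall on positives)
    sp = tn / (tn + fp)                   # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts chosen only to illustrate the calculation.
metrics = binary_metrics(tp=93, fn=7, tn=100, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is worth reporting alongside accuracy because it stays informative when the positive and negative classes differ in size, which a single accuracy figure can hide.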
Pages: 96-102
Number of pages: 7