iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition

Cited by: 119
Authors
Xiao, Xuan [1, 2, 5]
Ye, Han-Xiao [1]
Liu, Zi [3]
Jia, Jian-Hua [1]
Chou, Kuo-Chen [4, 5]
Affiliations
[1] Jing De Zhen Ceram Inst, Dept Comp, Jing De Zhen 333403, Peoples R China
[2] ZheJiang Text & Fash Coll, Informat Sch, Ningbo 315211, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
[4] King Abdulaziz Univ, CEGMR, Jeddah 21589, Saudi Arabia
[5] Gordon Life Sci Inst, Boston, MA 02478 USA
Keywords
origin of replication; position-specific dinucleotide propensity; general pseudo nucleotide composition; random forest; iROS-gPseKNC; AMINO-ACID-COMPOSITION; PROTEIN SUBCELLULAR LOCATION; SEQUENCE-BASED PREDICTOR; IDENTIFY RECOMBINATION SPOTS; LYSINE SUCCINYLATION SITES; FUSING FUNCTIONAL DOMAIN; MULTI-LABEL CLASSIFIER; WEB-SERVER; PHYSICOCHEMICAL PROPERTIES; ENSEMBLE CLASSIFIER;
DOI
10.18632/oncotarget.9057
Chinese Library Classification number
R73 [Oncology];
Discipline classification code
100214;
Abstract
DNA replication, which occurs in all living organisms and is the basis of biological inheritance, is the process of producing two identical replicas from one original DNA molecule. To understand this important biological process in depth and to use it in developing new strategies against genetic diseases, knowledge of replication origin sites in DNA is indispensable. With the explosive growth of DNA sequences emerging in the postgenomic age, it is highly desirable to develop high-throughput tools that identify these regions from sequence information alone. In this paper, a new predictor called iROS-gPseKNC is proposed by incorporating dinucleotide position-specific propensity information into the general pseudo nucleotide composition and using the random forest classifier. Rigorous cross-validations indicate that the proposed predictor is significantly better than the best existing method in sensitivity, specificity, overall accuracy, and stability. Furthermore, a user-friendly web server for iROS-gPseKNC has been established at http://www.jci-bioinfo.cn/iROS-gPseKNC, by which users can easily obtain their desired results without having to work through the complicated mathematics, which is presented only for the integrity of the methodology itself.
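The feature idea named in the abstract, a dinucleotide position-specific propensity, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the propensity at each position is the dinucleotide's relative frequency in positive (origin) samples minus its frequency in negative samples, and that all sequences are equal-length uppercase strings over A, C, G, T. The function names and the toy sequences are hypothetical. The resulting vectors are the kind of input a random forest classifier could be trained on.

```python
from itertools import product

# All 16 possible dinucleotides: AA, AC, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def position_frequencies(seqs):
    """For each dinucleotide position, map each pair to its relative frequency.

    Assumes every sequence has the same length and contains only A, C, G, T.
    """
    length = len(seqs[0])
    table = []
    for i in range(length - 1):
        counts = {d: 0 for d in DINUCLEOTIDES}
        for s in seqs:
            counts[s[i:i + 2]] += 1
        table.append({d: c / len(seqs) for d, c in counts.items()})
    return table


def propensity_matrix(positives, negatives):
    """Assumed propensity: positive-set frequency minus negative-set frequency."""
    pos = position_frequencies(positives)
    neg = position_frequencies(negatives)
    return [{d: p[d] - n[d] for d in DINUCLEOTIDES} for p, n in zip(pos, neg)]


def encode(seq, matrix):
    """Replace each dinucleotide in seq by its propensity at that position."""
    return [matrix[i][seq[i:i + 2]] for i in range(len(seq) - 1)]


# Toy data, invented purely for illustration.
positives = ["ACGTA", "ACGTT", "ACGAA"]
negatives = ["TTGCA", "TTGCC", "ATGCA"]

matrix = propensity_matrix(positives, negatives)
features = encode("ACGTA", matrix)  # one value per dinucleotide position
```

A sequence of length L yields L - 1 feature values, one per overlapping dinucleotide. In practice the propensity table would be estimated on training data only, so that cross-validation results are not inflated by information from the test fold.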
Pages: 34180-34189
Number of pages: 10