Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection technique

Cited by: 174
Authors
Dao, Fu-Ying [1, 2]
Lv, Hao [1, 2]
Wang, Fang [1, 2]
Feng, Chao-Qin [1, 2]
Ding, Hui [1, 2]
Chen, Wei [1, 2, 3]
Lin, Hao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Life Sci & Technol, Minist Educ, Key Lab NeuroInformat, Chengdu 610054, Sichuan, Peoples R China
[2] Univ Elect Sci & Technol China, Ctr Informat Biol, Chengdu 610054, Sichuan, Peoples R China
[3] Chengdu Univ Tradit Chinese Med, Innovat Inst Chinese Med & Pharm, Chengdu 611730, Sichuan, Peoples R China
Keywords
SEQUENCE-BASED PREDICTOR; UPDATED RESOURCE; WEB SERVER; DNA; YEAST; SITES; RNA; RECOGNITION; INITIATION; PROTEINS;
DOI
10.1093/bioinformatics/bty943
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: DNA replication is a key step in maintaining the continuity of genetic information between parents and offspring. The site at which DNA replication initiates, called the origin of replication (ORI), plays a central role in this basic biochemical process. Rapidly and accurately locating ORIs in a genome therefore provides key clues for genome analysis. Although biochemical experiments can provide detailed information about ORIs, they are costly and time-consuming. Computational methods can complement experimental techniques and overcome these disadvantages.
Results: In this study, we developed a predictor called iORI-PseKNC2.0 to identify ORIs in the Saccharomyces cerevisiae genome based on sequence information. ORI and non-ORI samples were formulated using PseKNC incorporating 90 physicochemical properties. To improve accuracy, a two-step feature selection was proposed to exclude redundant and noisy information. As a result, an overall success rate of 88.53% was achieved in the 5-fold cross-validation test using a support vector machine.
Availability and implementation: Based on the proposed model, a user-friendly webserver was established and can be freely accessed at http://lin-group.cn/server/iORI-PseKNC2.0. The webserver should be convenient for most wet-lab researchers.
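As a rough illustration of the sequence-based encoding named in the abstract, the sketch below computes only the plain k-mer (k-tuple nucleotide) composition that a PseKNC-style vector starts from. It is not the authors' implementation: the function name `kmer_composition` and the example sequence are hypothetical, and the physicochemical correlation terms, the two-step feature selection, and the support vector machine are all omitted.

```python
from itertools import product


def kmer_composition(seq, k=3):
    """Return normalized k-mer frequencies of a DNA sequence.

    Hypothetical helper for illustration only. The output has 4**k entries,
    ordered lexicographically over the alphabet A, C, G, T.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


# Example: dinucleotide composition of a short, made-up sequence.
features = kmer_composition("ATGCATGCAT", k=2)
```

A vector like `features` could then be passed to any feature selection step and classifier; which ones to use, and with what parameters, is left open here.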
Pages: 2075-2083
Page count: 9