A computational platform to identify origins of replication sites in eukaryotes

Cited by: 74
Authors
Dao, Fu-Ying [1]
Lv, Hao [1]
Zulfiqar, Hasan [1]
Yang, Hui [1]
Su, Wei [1]
Gao, Hui [1]
Ding, Hui [1]
Lin, Hao [1]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Informat Biol, Chengdu 610054, Peoples R China
Keywords
origins of replication site; eukaryote; feature extraction; webserver; classification algorithm; DNA-REPLICATION; PREDICTION; IDENTIFICATION; SEQUENCES; REVEALS;
DOI
10.1093/bib/bbaa017
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The locations at which genomic DNA replication initiates are defined as origins of replication sites (ORIs). ORIs regulate the onset of DNA replication and play significant roles in the replication process. The study of ORIs is essential for understanding the cell-division cycle and the regulation of gene expression. Accurate identification of ORIs by computational methods will provide important clues for DNA replication research and drug development. In this paper, the first integrated predictor, named iORI-Euk, was built to identify ORIs in multiple eukaryotes and multiple cell types. To construct the benchmark datasets, ORI data for seven eukaryotes (Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana, Pichia pastoris, Schizosaccharomyces pombe and Kluyveromyces lactis) were collected from public databases. Subsequently, three feature extraction strategies (k-mer, binary encoding, and the combination of k-mer and binary encoding) were used to formulate the DNA sequence samples. The performance of different classification algorithms was also compared. The best results were obtained with a support vector machine, in both the 5-fold cross-validation test and the independent dataset test. Based on the optimal model, an online web server called iORI-Euk (http://lin-group.cn/server/iORI-Euk/) was established for identifying novel ORIs.
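The three feature extraction strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the default k = 2, the frequency normalization, and the A/C/G/T bit assignment are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product

ALPHABET = "ACGT"

# Assumed one-hot mapping; the abstract does not state the paper's bit assignment.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies in lexicographic k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """Concatenate a 4-bit one-hot vector per nucleotide (all zeros if unknown)."""
    vec = []
    for base in seq.upper():
        vec.extend(ONE_HOT.get(base, [0, 0, 0, 0]))
    return vec


def combined_features(seq, k=2):
    """Concatenate the k-mer and binary feature vectors for one sequence."""
    return kmer_frequencies(seq, k) + binary_encoding(seq)
```

For a sequence of length L, the k-mer vector has 4^k entries and the binary vector has 4L entries, so the binary and combined representations require all samples to share the same length before they are passed to a classifier such as a support vector machine.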
Pages: 1940-1950
Page count: 11