A binary search approach to whole-genome data analysis

Cited by: 7
Authors
Brodsky, Leonid [1 ]
Kogan, Simon [1 ]
BenJacob, Eshel [2 ]
Nevo, Eviatar [1 ]
Affiliations
[1] Univ Haifa, Inst Evolut, IL-31905 Haifa, Israel
[2] Tel Aviv Univ, Sch Phys & Astron, IL-69978 Tel Aviv, Israel
Keywords
genome segmentation; tiling array; next-generation sequencing; MODEL-BASED ANALYSIS; TILING MICROARRAY; CHIP-SEQ; MAP;
DOI
10.1073/pnas.1011134107
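Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code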
07 ; 0710 ; 09 ;
Abstract
A sequence analysis-oriented, binary search-like algorithm was transformed into a sensitive and accurate analysis tool for processing whole-genome data. The advantage of the algorithm over previous methods is its ability to detect the margins of both short and long genome fragments, enriched by up-regulated signals, with equal accuracy. The score of an enriched genome fragment reflects the difference between the actual concentration of up-regulated signals in the fragment and the chromosome signal baseline. The "divide-and-conquer"-type algorithm detects a series of nonintersecting fragments of various lengths with locally optimal scores. The procedure is applied to detected fragments in a nested manner by recalculating the lower-than-baseline signals in the chromosome. The algorithm was applied to simulated whole-genome data, and its sensitivity and specificity were compared with those of several alternative algorithms. The algorithm was also tested with four biological tiling array datasets comprising Arabidopsis (i) expression and (ii) histone 3 lysine 27 trimethylation ChIP-on-chip datasets; Saccharomyces cerevisiae (iii) spliced intron data and (iv) chromatin remodeling factor binding sites. The results of these analyses demonstrate the power of the algorithm in identifying both the short up-regulated fragments (such as exons and transcription factor binding sites) and the long, even moderately up-regulated, zones at their precise genome margins. The algorithm generates an accurate whole-genome landscape that could be used for cross-comparison of signals across the same genome in evolutionary and general genomic studies.
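The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes: score a fragment by how far its signal sits above a baseline, take the best-scoring fragment, then repeat on the regions to its left and right so that the reported fragments never intersect. It is not the published algorithm. The function names, the fixed `baseline`, the `min_score` cutoff, and the use of a maximum-sum scan to pick each fragment are all assumptions made for illustration, and the nested re-analysis with baseline recalculation mentioned in the abstract is left out.

```python
from typing import List, Sequence, Tuple


def best_segment(scores: Sequence[float], lo: int, hi: int) -> Tuple[float, int, int]:
    """Return (total, start, end) of the maximum-sum run inside scores[lo:hi].

    `end` is exclusive. This is a standard Kadane-style scan, used here as a
    stand-in for "the fragment with the locally optimal score".
    """
    best = (float("-inf"), lo, lo)
    run_sum, run_start = 0.0, lo
    for i in range(lo, hi):
        if run_sum <= 0:
            # A non-positive prefix can never help; start a new run here.
            run_sum, run_start = 0.0, i
        run_sum += scores[i]
        if run_sum > best[0]:
            best = (run_sum, run_start, i + 1)
    return best


def enriched_fragments(
    signal: Sequence[float], baseline: float, min_score: float
) -> List[Tuple[int, int, float]]:
    """Split a signal track into nonintersecting above-baseline fragments.

    Each probe is scored as (signal - baseline). The best fragment in a region
    is reported, and the regions on either side of it are searched in turn.
    `baseline` and `min_score` are hypothetical tuning parameters.
    """
    scores = [x - baseline for x in signal]
    found: List[Tuple[int, int, float]] = []
    pending = [(0, len(scores))]  # explicit stack instead of recursion
    while pending:
        lo, hi = pending.pop()
        if lo >= hi:
            continue
        total, start, end = best_segment(scores, lo, hi)
        if total < min_score:
            continue  # nothing in this region is enriched enough
        found.append((start, end, total))
        pending.append((lo, start))  # region left of the fragment
        pending.append((end, hi))    # region right of the fragment
    return sorted(found)


if __name__ == "__main__":
    track = [0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 3, 3, 3, 0]
    print(enriched_fragments(track, baseline=1.0, min_score=3.0))
    # [(2, 4, 8.0), (10, 13, 6.0)]
```

In the toy track, a short strong fragment (positions 2 to 3) and a longer, weaker one (positions 10 to 12) are both recovered with exact margins, which mirrors the short-versus-long behavior the abstract emphasizes. A fixed baseline is the main simplification: with a shorter gap between the two fragments, this sketch would merge them into one.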
Pages: 16893-16898
Number of pages: 6
Related papers
13 items in total
[1]   Rank-statistics based enrichment-site prediction algorithm developed for chromatin immunoprecipitation on chip experiments [J].
Ghosh, Srinka ;
Hirsch, Heather A. ;
Sekinger, Edward ;
Struhl, Kevin ;
Gingeras, Thomas R. .
BMC BIOINFORMATICS, 2006, 7 (1)
[2]   A large number of novel coding small open reading frames in the intergenic regions of the Arabidopsis thaliana genome are transcribed and/or under purifying selection [J].
Hanada, Kousuke ;
Zhang, Xu ;
Borevitz, Justin O. ;
Li, Wen-Hsiung ;
Shiu, Shin-Han .
GENOME RESEARCH, 2007, 17 (05) :632-640
[3]   TileMap: create chromosomal map of tiling array hybridizations [J].
Ji, HK ;
Wong, WH .
BIOINFORMATICS, 2005, 21 (18) :3629-3636
[4]   Model-based analysis of tiling-arrays for ChIP-chip [J].
Johnson, W. Evan ;
Li, Wei ;
Meyer, Clifford A. ;
Gottardo, Raphael ;
Carroll, Jason S. ;
Brown, Myles ;
Liu, X. Shirley .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (33) :12457-12462
[5]   CONSTRUCTION OF THE FULL LOCAL SIMILARITY MAP FOR 2 BIOPOLYMERS [J].
LEONTOVICH, AM ;
BRODSKY, LI ;
GORBALENYA, AE .
BIOSYSTEMS, 1993, 30 (1-3) :57-63
[6]   Applications of recursive segmentation to the analysis of DNA sequences [J].
Li, WT ;
Bernaola-Galván, P ;
Haghighi, F ;
Grosse, I .
COMPUTERS & CHEMISTRY, 2002, 26 (05) :491-510
[7]   Getting started in tiling microarray analysis [J].
Liu, X. Shirley .
PLOS COMPUTATIONAL BIOLOGY, 2007, 3 (10) :1842-1844
[8]   ChIP-seq: welcome to the new frontier [J].
Mardis, Elaine R. .
NATURE METHODS, 2007, 4 (08) :613-614
[9]   Genome sequencing in microfabricated high-density picolitre reactors [J].
Margulies, M ;
Egholm, M ;
Altman, WE ;
Attiya, S ;
Bader, JS ;
Bemben, LA ;
Berka, J ;
Braverman, MS ;
Chen, YJ ;
Chen, ZT ;
Dewell, SB ;
Du, L ;
Fierro, JM ;
Gomes, XV ;
Godwin, BC ;
He, W ;
Helgesen, S ;
Ho, CH ;
Irzyk, GP ;
Jando, SC ;
Alenquer, MLI ;
Jarvie, TP ;
Jirage, KB ;
Kim, JB ;
Knight, JR ;
Lanza, JR ;
Leamon, JH ;
Lefkowitz, SM ;
Lei, M ;
Li, J ;
Lohman, KL ;
Lu, H ;
Makhijani, VB ;
McDade, KE ;
McKenna, MP ;
Myers, EW ;
Nickerson, E ;
Nobile, JR ;
Plant, R ;
Puc, BP ;
Ronan, MT ;
Roth, GT ;
Sarkis, GJ ;
Simons, JF ;
Simpson, JW ;
Srinivasan, M ;
Tartaro, KR ;
Tomasz, A ;
Vogt, KA ;
Volkmer, GA .
NATURE, 2005, 437 (7057) :376-380
[10]   Vostrikova L. J., 1981, Soviet Math Doklady, V24, P55