A binary search approach to whole-genome data analysis

Cited by: 7
Authors
Brodsky, Leonid [1]
Kogan, Simon [1]
Ben-Jacob, Eshel [2]
Nevo, Eviatar [1]
Affiliations
[1] Univ Haifa, Inst Evolut, IL-31905 Haifa, Israel
[2] Tel Aviv Univ, Sch Phys & Astron, IL-69978 Tel Aviv, Israel
Keywords
genome segmentation; tiling array; next-generation sequencing; model-based analysis; tiling microarray; ChIP-seq; MAP
DOI
10.1073/pnas.1011134107
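Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract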
A sequence analysis-oriented binary search-like algorithm was transformed into a sensitive and accurate analysis tool for processing whole-genome data. The advantage of the algorithm over previous methods is its ability to detect the margins of both short and long genome fragments, enriched by up-regulated signals, at equal accuracy. The score of an enriched genome fragment reflects the difference between the actual concentration of up-regulated signals in the fragment and the chromosome signal baseline. The "divide-and-conquer"-type algorithm detects a series of nonintersecting fragments of various lengths with locally optimal scores. The procedure is applied to detected fragments in a nested manner by recalculating the lower-than-baseline signals in the chromosome. The algorithm was applied to simulated whole-genome data, and its sensitivity and specificity were compared with those of several alternative algorithms. The algorithm was also tested with four biological tiling array datasets comprising Arabidopsis (i) expression and (ii) histone 3 lysine 27 trimethylation ChIP-on-chip datasets, and Saccharomyces cerevisiae (iii) spliced intron data and (iv) chromatin remodeling factor binding sites. The results of the analyses demonstrate the power of the algorithm in identifying both the short up-regulated fragments (such as exons and transcription factor binding sites) and the long (even moderately up-regulated) zones at their precise genome margins. The algorithm generates an accurate whole-genome landscape that could be used for cross-comparison of signals across the same genome in evolutionary and general genomic studies.
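The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fragment's score is simply the sum of (signal minus baseline) over its probes, finds the best-scoring fragment with Kadane's maximum-subarray scan, and then recurses on the regions to its left and right. The names `best_segment`, `enriched_fragments`, `baseline`, and `min_score` are hypothetical, and the nested re-analysis of detected fragments described in the abstract is left out.

```python
from typing import List, Sequence, Tuple


def best_segment(scores: Sequence[float], lo: int, hi: int) -> Tuple[float, int, int]:
    """Kadane scan over scores[lo:hi]; returns (sum, start, end), end exclusive."""
    best_sum, best_start, best_end = float("-inf"), lo, lo
    run_sum, run_start = 0.0, lo
    for i in range(lo, hi):
        if run_sum <= 0:
            # A non-positive running sum cannot help; restart the run here.
            run_sum, run_start = 0.0, i
        run_sum += scores[i]
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_sum, best_start, best_end


def enriched_fragments(
    signal: Sequence[float], baseline: float, min_score: float
) -> List[Tuple[int, int, float]]:
    """Return nonintersecting (start, end, score) fragments, left to right.

    Assumption: score = sum of (signal - baseline) within the fragment.
    """
    scores = [x - baseline for x in signal]
    found: List[Tuple[int, int, float]] = []

    def split(lo: int, hi: int) -> None:
        if lo >= hi:
            return
        score, start, end = best_segment(scores, lo, hi)
        if score < min_score:
            return  # nothing enriched enough in this region
        split(lo, start)                    # search left of the fragment
        found.append((start, end, score))
        split(end, hi)                      # search right of the fragment

    split(0, len(scores))
    return found


if __name__ == "__main__":
    # One short, strongly enriched fragment and one longer, moderate one.
    demo = [0, 0, 5, 5, 5, 0, 0, 0, 0, 0, 2, 2, 2, 2, 0, 0]
    print(enriched_fragments(demo, baseline=1, min_score=2))
```

Because each recursive call searches only outside the fragment already reported, the resulting fragments never overlap, and both the short strong peak and the longer moderate zone in the demo are reported with their own boundaries. The recursion is kept for readability; for chromosome-scale input an explicit stack would be a safer choice.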
Pages: 16893-16898
Page count: 6