Software for Computing and Annotating Genomic Ranges

Cited by: 2532
Authors
Lawrence, Michael [1]
Huber, Wolfgang [2, 3]
Pages, Herve [4]
Aboyoun, Patrick [4]
Carlson, Marc [4]
Gentleman, Robert [1]
Morgan, Martin T. [4]
Carey, Vincent J. [5]
Affiliations
[1] Genentech Inc, Bioinformat & Computat Biol, San Francisco, CA 94080 USA
[2] European Mol Biol Lab, Genome Biol Unit, D-69012 Heidelberg, Germany
[3] European Bioinformat Inst, Cambridge, England
[4] Fred Hutchinson Canc Res Ctr, Seattle, WA 98104 USA
[5] Harvard Univ, Brigham & Womens Hosp, Sch Med, Channing Div Network Med, Boston, MA 02115 USA
Funding
US National Institutes of Health
Keywords
BIOCONDUCTOR; PACKAGE
DOI
10.1371/journal.pcbi.1003118
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
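The range operations named in the abstract (overlap detection and coverage calculation) can be illustrated with a minimal sketch. This is plain Python, not the IRanges or GenomicRanges API; the function names `find_overlaps` and `coverage`, the closed 1-based interval convention, and the sample data are assumptions made for illustration only. The overlap search below sorts and scans, which is simpler and slower than the efficient algorithms the abstract refers to.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical representation: closed, 1-based integer intervals (start, end).
Range = Tuple[int, int]


def find_overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs whose ranges share a position."""
    # Sort subject indices by start so that, for each query, only subjects
    # starting at or before the query's end need to be inspected.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    starts = [subject[i][0] for i in order]
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        upper = bisect_right(starts, q_end)
        for k in range(upper):
            si = order[k]
            # Two closed intervals overlap when each starts before the other ends.
            if subject[si][1] >= q_start:
                hits.append((qi, si))
    return hits


def coverage(ranges: List[Range], length: int) -> List[int]:
    """Return the number of ranges covering each position 1..length."""
    # Difference array: +1 where a range begins, -1 just past where it ends.
    diff = [0] * (length + 2)
    for start, end in ranges:
        start, end = max(start, 1), min(end, length)  # clip to the sequence
        if start > end:
            continue
        diff[start] += 1
        diff[end + 1] -= 1
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


# Illustrative data only: three "reads" and two "features" on one sequence.
reads = [(1, 4), (3, 6), (10, 12)]
features = [(5, 8), (11, 15)]
overlap_pairs = find_overlaps(reads, features)
read_depth = coverage(reads, 12)
```

Here `overlap_pairs` lists which read overlaps which feature by index, and `read_depth` gives the per-position read count, the kind of coverage vector the abstract mentions.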
Pages: 10
Related papers
15 entries in total
  • [1] MAINTAINING KNOWLEDGE ABOUT TEMPORAL INTERVALS
    ALLEN, JF
    [J]. COMMUNICATIONS OF THE ACM, 1983, 26 (11) : 832 - 843
  • [2] [Anonymous], CHIPSEQ PACKAGE ANAL
  • [3] Cormen T., 2001, Introduction to Algorithms
  • [4] Bioconductor: open software development for computational biology and bioinformatics
    Gentleman, RC
    Carey, VJ
    Bates, DM
    Bolstad, B
    Dettling, M
    Dudoit, S
    Ellis, B
    Gautier, L
    Ge, YC
    Gentry, J
    Hornik, K
    Hothorn, T
    Huber, W
    Iacus, S
    Irizarry, R
    Leisch, F
    Li, C
    Maechler, M
    Rossini, AJ
    Sawitzki, G
    Smith, C
    Smyth, G
    Tierney, L
    Yang, JYH
    Zhang, JH
    [J]. GENOME BIOLOGY, 2004, 5 (10)
  • [5] An integrated software system for analyzing ChIP-chip and ChIP-seq data
    Ji, Hongkai
    Jiang, Hui
    Ma, Wenxiu
    Johnson, David S.
    Myers, Richard M.
    Wong, Wing H.
    [J]. NATURE BIOTECHNOLOGY, 2008, 26 (11) : 1293 - 1300
  • [6] Lawrence M, 2013, VARIANT TOOLS TOOLS
  • [7] rtracklayer: an R package for interfacing with genome browsers
    Lawrence, Michael
    Gentleman, Robert
    Carey, Vincent
    [J]. BIOINFORMATICS, 2009, 25 (14) : 1841 - 1842
  • [8] Novel Low Abundance and Transient RNAs in Yeast Revealed by Tiling Microarrays and Ultra High-Throughput Sequencing Are Not Conserved Across Closely Related Yeast Species
    Lee, Albert
    Hansen, Kasper Daniel
    Bullard, James
    Dudoit, Sandrine
    Sherlock, Gavin
    [J]. PLOS GENETICS, 2008, 4 (12):
  • [9] Mapping short DNA sequencing reads and calling variants using mapping quality scores
    Li, Heng
    Ruan, Jue
    Durbin, Richard
    [J]. GENOME RESEARCH, 2008, 18 (11) : 1851 - 1858
  • [10] Heritable Individual-Specific and Allele-Specific Chromatin Signatures in Humans
    McDaniell, Ryan
    Lee, Bum-Kyu
    Song, Lingyun
    Liu, Zheng
    Boyle, Alan P.
    Erdos, Michael R.
    Scott, Laura J.
    Morken, Mario A.
    Kucera, Katerina S.
    Battenhouse, Anna
    Keefe, Damian
    Collins, Francis S.
    Willard, Huntington F.
    Lieb, Jason D.
    Furey, Terrence S.
    Crawford, Gregory E.
    Iyer, Vishwanath R.
    Birney, Ewan
    [J]. SCIENCE, 2010, 328 (5975) : 235 - 239