FIMO: scanning for occurrences of a given motif

Cited by: 2711
Authors
Grant, Charles E. [2]
Bailey, Timothy L. [1]
Noble, William Stafford [2, 3]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld, Australia
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA)
Keywords
DISCOVERY
DOI
10.1093/bioinformatics/btr064
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix.
Results: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU.
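The three steps named in the abstract (log-likelihood ratio score, dynamic programming P-value, false discovery rate q-value) can be illustrated with a minimal Python sketch. This is not FIMO's implementation: all function names, the pseudocount, and the integer scaling factor are assumptions chosen for illustration, only the forward strand of an uppercase ACGT sequence is scanned, and the Benjamini-Hochberg adjustment is swapped in as a simple stand-in for the Storey-style q-value estimation that the abstract refers to.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def log_odds_matrix(freqs, background, pseudocount=0.1, scale=100):
    """Turn per-position frequencies into integer-scaled log2 likelihood ratios.
    pseudocount and scale are illustrative assumptions, not FIMO defaults."""
    matrix = []
    for column in freqs:
        total = sum(column[b] for b in ALPHABET) + pseudocount
        row = {}
        for b in ALPHABET:
            p = (column[b] + pseudocount * background[b]) / total
            row[b] = round(scale * math.log2(p / background[b]))
        matrix.append(row)
    return matrix


def score_windows(seq, matrix):
    """Score every window of motif width along the sequence."""
    w = len(matrix)
    return [sum(matrix[i][seq[s + i]] for i in range(w))
            for s in range(len(seq) - w + 1)]


def pvalue_table(matrix, background):
    """Dynamic programming over motif columns: build the null distribution
    of the integer score, then P(S >= s) for every reachable score s."""
    dist = {0: 1.0}
    for row in matrix:
        nxt = defaultdict(float)
        for s, p in dist.items():
            for b in ALPHABET:
                nxt[s + row[b]] += p * background[b]
        dist = nxt
    table, tail = {}, 0.0
    for s in sorted(dist, reverse=True):
        tail += dist[s]
        table[s] = min(tail, 1.0)
    return table


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for q-values)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running = min(running, pvalues[i] * n / rank)
        q[i] = running
    return q


def scan(seq, freqs, background):
    """Return (start, score, p_value, q_value) for each window of seq."""
    matrix = log_odds_matrix(freqs, background)
    scores = score_windows(seq, matrix)
    table = pvalue_table(matrix, background)
    pvals = [table[s] for s in scores]
    qvals = bh_qvalues(pvals)
    return list(zip(range(len(scores)), scores, pvals, qvals))


# Hypothetical toy motif (consensus TATA) and uniform background.
background = {b: 0.25 for b in ALPHABET}
freqs = [
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
]
hits = scan("GGCGTATACCGG", freqs, background)
best = min(hits, key=lambda h: h[2])  # window with the smallest P-value
print(best)
```

Rounding the scores to integers keeps the set of reachable totals small, so the null distribution can be built one motif column at a time instead of enumerating every possible word of motif width.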
Pages: 1017-1018
Page count: 2
Related papers
7 items in total
  • [1] Searching for statistically significant regulatory modules
    Bailey, Timothy L.
    Noble, William Stafford
    [J]. BIOINFORMATICS, 2003, 19 : II16 - II25
  • [2] MEME SUITE: tools for motif discovery and searching
    Bailey, Timothy L.
    Boden, Mikael
    Buske, Fabian A.
    Frith, Martin
    Grant, Charles E.
    Clementi, Luca
    Ren, Jingyuan
    Li, Wilfred W.
    Noble, William S.
    [J]. NUCLEIC ACIDS RESEARCH, 2009, 37 : W202 - W208
  • [3] Combining evidence using p-values: application to sequence homology searches
    Bailey, TL
    Gribskov, M
    [J]. BIOINFORMATICS, 1998, 14 (01) : 48 - 54
  • [4] CisML: an XML-based format for sequence motif detection software
    Haverty, PM
    Weng, ZP
    [J]. BIOINFORMATICS, 2004, 20 (11) : 1815 - 1817
  • [5] Staden R, 1994, Methods Mol Biol, V25, P93
  • [6] The positive false discovery rate: A Bayesian interpretation and the q-value
    Storey, JD
    [J]. ANNALS OF STATISTICS, 2003, 31 (06) : 2013 - 2035
  • [7] A direct approach to false discovery rates
    Storey, JD
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2002, 64 : 479 - 498