A general approach for discriminative de novo motif discovery from high-throughput data

Cited by: 32
Authors
Grau, Jan [1]
Posch, Stefan [1]
Grosse, Ivo [1]
Keilwagen, Jens [2, 3]
Affiliations
[1] Univ Halle Wittenberg, Inst Comp Sci, D-06099 Halle, Saale, Germany
[2] Fed Res Ctr Cultivated Plants, Julius Kuhn Inst, Inst Biosafety Plant Biotechnol, D-06484 Quedlinburg, Germany
[3] Leibniz Inst Plant Genet & Crop Plant Res IPK, Dept Mol Genet, D-06466 Seeland Ot Gatersleben, Germany
Keywords
PROTEIN-DNA INTERACTIONS; CHIP-SEQ DATA; FACTOR-BINDING SITES; TRANSCRIPTION FACTOR; POSITIONAL INFORMATION; GENOME; SPECIFICITY; RESOLUTION; SEQUENCES; NETWORK;
DOI
10.1093/nar/gkt831
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010; 081704;
Abstract
De novo motif discovery has been an important challenge of bioinformatics for the past two decades. Since the emergence of high-throughput techniques like ChIP-seq, ChIP-exo and protein-binding microarrays (PBMs), the focus of de novo motif discovery has shifted to runtime and accuracy on large data sets. For this purpose, specialized algorithms have been designed for discovering motifs in ChIP-seq or PBM data. However, none of the existing approaches work perfectly for all three high-throughput techniques. In this article, we propose Dimont, a general approach for fast and accurate de novo motif discovery from high-throughput data. We demonstrate that Dimont yields a higher number of correct motifs from ChIP-seq data than any of the specialized approaches and achieves a higher accuracy for predicting PBM intensities from probe sequence than any of the approaches specifically designed for that purpose. Dimont also reports the expected motifs for several ChIP-exo data sets. Investigating differences between in vitro and in vivo binding, we find that for most transcription factors, the motifs discovered by Dimont are in good accordance between techniques, but we also find notable exceptions. We also observe that modeling intra-motif dependencies may increase accuracy, which indicates that more complex motif models are a worthwhile field of research.
Pages: 11