DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data

Cited by: 6
Authors
Saad, Chadi [1 ,2 ]
Noe, Laurent [1 ]
Richard, Hugues [3 ]
Leclerc, Julie [2 ]
Buisine, Marie-Pierre [2 ]
Touzet, Helene [1 ]
Figeac, Martin [4 ]
Affiliations
[1] Univ Lille, CNRS, Inria, CRIStAL,UMR 9189, Lille, France
[2] Univ Lille, Inserm, Lille Univ Hosp, Ctr Rech Jean Pierre AUBERT,JPARC,UMR S 1172, F-59000 Lille, France
[3] Sorbonne Univ, UMR7238, LCQB, F-75005 Paris, France
[4] Univ Lille, Plateau Genom Fonct & Struct, F-59000 Lille, France
Source
BMC BIOINFORMATICS | 2018 / Volume 19
Keywords
Motif; DNA; Chip-Seq; CHIP-SEQ; BINDING SITES; FRAMEWORK; GENOMICS; CELLS;
DOI
10.1186/s12859-018-2215-1
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Discovering over-represented approximate motifs in DNA sequences is an essential part of bioinformatics. This topic has been studied extensively because of the increasing number of potential applications. However, it remains a difficult challenge, especially with the huge quantity of data generated by high-throughput sequencing technologies. To overcome this problem, existing tools use greedy algorithms and probabilistic approaches to find motifs in reasonable time. Nevertheless, these approaches lack sensitivity and have difficulty coping with rare and subtle motifs. Results: We developed DiNAMO (for DNA MOtif), new software based on an exhaustive and efficient algorithm for IUPAC motif discovery. We evaluated DiNAMO on synthetic and real datasets with two different applications, namely ChIP-seq peaks and Systematic Sequencing Error analysis. DiNAMO compares favorably with other existing methods and is robust to noise. Conclusions: We have shown that the DiNAMO software can serve as a tool to search for degenerate motifs in an exact manner using IUPAC models. DiNAMO can be used in scanning mode with sliding windows or in fixed position mode, which makes it suitable for numerous potential applications.
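As a rough illustration of the two modes named in the abstract, the sketch below counts how many sequences contain a degenerate IUPAC motif, either by scanning every window or by testing one fixed offset. It is not DiNAMO's algorithm or interface: the function names `matches_at` and `count_sequences`, the `fixed_pos` parameter, and the toy sequences are all hypothetical, and only the standard IUPAC nucleotide codes are taken as given.

```python
# Illustrative sketch only; this is NOT DiNAMO's implementation or API.
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_at(motif, seq, pos):
    """Return True if the IUPAC motif matches seq starting at index pos."""
    if pos < 0 or pos + len(motif) > len(seq):
        return False
    return all(seq[pos + i] in IUPAC[code] for i, code in enumerate(motif))


def count_sequences(motif, seqs, fixed_pos=None):
    """Count the sequences that contain the motif.

    fixed_pos=None (hypothetical parameter): scanning mode, every window is tried.
    fixed_pos=k: fixed position mode, only offset k is tested.
    """
    count = 0
    for seq in seqs:
        if fixed_pos is None:
            last_start = len(seq) - len(motif)
            hit = any(matches_at(motif, seq, p) for p in range(last_start + 1))
        else:
            hit = matches_at(motif, seq, fixed_pos)
        count += int(hit)
    return count


# Toy data, invented for this example.
seqs = ["ACGTAC", "TTAGTT", "GGGGGG"]
scanning_hits = count_sequences("NGT", seqs)
fixed_hits = count_sequences("NGT", seqs, fixed_pos=1)
```

A discovery tool would compare such counts between a signal set and a control set to decide whether a motif is over-represented; that statistical step is left out here.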
Pages: 10
Related papers
50 in total
  • [1] DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data
    Chadi Saad
    Laurent Noé
    Hugues Richard
    Julie Leclerc
    Marie-Pierre Buisine
    Hélène Touzet
    Martin Figeac
    BMC Bioinformatics, 19
  • [2] Genome variation discovery with high-throughput sequencing data
    Dalca, Adrian V.
    Brudno, Michael
    BRIEFINGS IN BIOINFORMATICS, 2010, 11 (01) : 3 - 14
  • [3] Targeted High-Throughput DNA Sequencing for Gene Discovery in Retinitis Pigmentosa
    Daiger, Stephen P.
    Sullivan, Lori S.
    Bowne, Sara J.
    Birch, David G.
    Heckenlively, John R.
    Pierce, Eric A.
    Weinstock, George M.
    RETINAL DEGENERATIVE DISEASES: LABORATORY AND THERAPEUTIC INVESTIGATIONS, 2010, 664 : 325 - 331
  • [4] High-throughput DNA sequencing: A genomic data manufacturing process
    Huang, GM
    DNA SEQUENCE, 1999, 10 (03): : 149 - 153
  • [5] Optimizing depth and type of high-throughput sequencing data for microsatellite discovery
    Chapman, Mark A.
    APPLICATIONS IN PLANT SCIENCES, 2019, 7 (11):
  • [6] DNA sequencing in high-throughput neuroanatomy
    Kebschull, Justus M.
    JOURNAL OF CHEMICAL NEUROANATOMY, 2019, 100
  • [7] SNP discovery by high-throughput sequencing in soybean
    Xiaolei Wu
    Chengwei Ren
    Trupti Joshi
    Tri Vuong
    Dong Xu
    Henry T Nguyen
    BMC Genomics, 11
  • [8] SNP discovery by high-throughput sequencing in soybean
    Wu, Xiaolei
    Ren, Chengwei
    Joshi, Trupti
    Vuong, Tri
    Xu, Dong
    Nguyen, Henry T.
    BMC GENOMICS, 2010, 11
  • [9] A general approach for discriminative de novo motif discovery from high-throughput data
    Grau, Jan
    Posch, Stefan
    Grosse, Ivo
    Keilwagen, Jens
    NUCLEIC ACIDS RESEARCH, 2013, 41 (21)
  • [10] Optimizing a MapReduce Module of Preprocessing High-Throughput DNA Sequencing Data
    Chung, Wei-Chun
    Chang, Yu-Jung
    Chen, Chien-Chih
    Lee, Der-Tsai
    Ho, Jan-Ming
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,