DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data

Cited by: 6

Authors:

Saad, Chadi [1,2]

Noe, Laurent [1]

Richard, Hugues [3]

Leclerc, Julie [2]

Buisine, Marie-Pierre [2]

Touzet, Helene [1]

Figeac, Martin [4]

Affiliations:

[1] Univ Lille, CNRS, Inria, CRIStAL, UMR 9189, Lille, France

[2] Univ Lille, Inserm, Lille Univ Hosp, Ctr Rech Jean Pierre AUBERT, JPARC, UMR S 1172, F-59000 Lille, France

[3] Sorbonne Univ, UMR7238, LCQB, F-75005 Paris, France

[4] Univ Lille, Plateau Genom Fonct & Struct, F-59000 Lille, France

Source:

BMC Bioinformatics | 2018 / Volume 19

Keywords:

Motif; DNA; ChIP-seq; Binding sites; Framework; Genomics; Cells

DOI:

10.1186/s12859-018-2215-1

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]

Discipline classification codes:

071010; 081704

Abstract:

Background: Discovering over-represented approximate motifs in DNA sequences is an essential part of bioinformatics. This topic has been studied extensively because of the increasing number of potential applications. However, it remains a difficult challenge, especially with the huge quantity of data generated by high-throughput sequencing technologies. To overcome this problem, existing tools use greedy algorithms and probabilistic approaches to find motifs in reasonable time. Nevertheless, these approaches lack sensitivity and have difficulty coping with rare and subtle motifs. Results: We developed DiNAMO (for DNA MOtif), a new software tool based on an exhaustive and efficient algorithm for IUPAC motif discovery. We evaluated DiNAMO on synthetic and real datasets with two different applications, namely ChIP-seq peaks and Systematic Sequencing Error analysis. DiNAMO compares favorably with other existing methods and is robust to noise. Conclusions: We have shown that the DiNAMO software can serve as a tool to search for degenerate motifs in an exact manner using IUPAC models. DiNAMO can be used in scanning mode with sliding windows or in fixed position mode, which makes it suitable for numerous potential applications.
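
Note: The abstract refers to IUPAC motifs and to two usage modes (scanning with sliding windows, and fixed position). The Python sketch below is only an illustration of what matching a degenerate IUPAC motif means in each of those two settings. It is not DiNAMO's algorithm or interface: the function names (`matches_at`, `scan`, `count_sequences`) are hypothetical, and reading "fixed position mode" as "test the motif at one given index in every sequence" is an assumption made for this example.

```python
# Standard IUPAC nucleotide codes, each mapped to the bases it can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches_at(motif, seq, pos):
    """Return True if the IUPAC motif matches seq starting at index pos."""
    if pos < 0 or pos + len(motif) > len(seq):
        return False
    return all(seq[pos + i] in IUPAC[code] for i, code in enumerate(motif))


def scan(motif, seq):
    """Sliding-window scan: start indices of every match of motif in seq."""
    last_start = len(seq) - len(motif)
    return [p for p in range(last_start + 1) if matches_at(motif, seq, p)]


def count_sequences(motif, seqs, fixed_pos=None):
    """Count sequences that contain the motif.

    fixed_pos=None : the motif may occur anywhere (sliding window).
    fixed_pos=k    : the motif must occur exactly at index k (assumed
                     meaning of a fixed-position test, for illustration).
    """
    if fixed_pos is None:
        return sum(1 for s in seqs if scan(motif, s))
    return sum(1 for s in seqs if matches_at(motif, s, fixed_pos))


if __name__ == "__main__":
    reads = ["AAGT", "CCCC", "GGTA"]
    print(scan("RGT", "AAGTGGTC"))                   # start indices of matches
    print(count_sequences("RGT", reads))             # motif anywhere
    print(count_sequences("RGT", reads, fixed_pos=0))  # motif at index 0 only
```

An over-representation test would compare such counts between a signal set and a control set of sequences; that statistical step is not shown here.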

Pages: 10
