Benchmarking and building DNA binding affinity models using allele-specific and allele-agnostic transcription factor binding data
Cited by: 0
Authors:
Li, Xiaoting [1]
Melo, Lucas A. N. [1]
Bussemaker, Harmen J. [1, 2]
Affiliations:
[1] Columbia Univ, Dept Biol Sci, New York, NY 10027 USA
[2] Columbia Univ, Dept Syst Biol, New York, NY 10032 USA
Abstract:
Background: Transcription factors (TFs) bind to DNA in a highly sequence-specific manner. This specificity manifests itself in vivo as differences in TF occupancy between the two alleles at heterozygous loci. Genome-scale assays such as ChIP-seq currently are limited in their power to detect allele-specific binding (ASB) both in terms of read coverage and representation of individual variants in the cell lines used. This makes prediction of allelic differences in TF binding from sequence alone desirable, provided that the reliability of such predictions can be quantitatively assessed.
Results: We here propose methods for benchmarking sequence-to-affinity models for TF binding in terms of their ability to predict allelic imbalances in ChIP-seq counts. We use a likelihood function based on an over-dispersed binomial distribution to aggregate evidence for allelic preference across the genome without requiring statistical significance for individual variants. This allows us to systematically compare predictive performance when multiple binding models for the same TF are available. To facilitate the de novo inference of high-quality models from paired-end in vivo binding data such as ChIP-seq, ChIP-exo, and CUT&Tag without read mapping or peak calling, we introduce an extensible reimplementation of our biophysically interpretable machine learning framework named PyProBound. Explicitly accounting for assay-specific bias in DNA fragmentation rate when training on ChIP-seq yields improved TF binding models. Moreover, we show how PyProBound can leverage our threshold-free ASB likelihood function to perform de novo motif discovery using allele-specific ChIP-seq counts.
Conclusion: Our work provides new strategies for predicting the functional impact of non-coding variants.
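The threshold-free likelihood idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, so everything below is an assumption made for illustration: the over-dispersed binomial is taken to be a beta-binomial with mean `p` and overdispersion `rho`, the expected reference-allele read fraction is taken to be K_ref / (K_ref + K_alt) (occupancy proportional to affinity), and all function names are hypothetical. It is not the PyProBound API.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, p, rho):
    """Log-probability of k reference reads out of n total reads.

    Assumed parameterization: mean fraction p in (0, 1) and overdispersion
    rho in (0, 1), where rho = 1 / (a + b + 1) for Beta shapes a and b.
    """
    s = (1.0 - rho) / rho          # a + b
    a, b = p * s, (1.0 - p) * s    # Beta shape parameters with mean p
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def predicted_ref_fraction(delta_log_affinity):
    """Expected reference-allele fraction K_ref / (K_ref + K_alt).

    delta_log_affinity = ln(K_ref / K_alt) as predicted by a binding model.
    Assumes occupancy is proportional to affinity (an illustrative choice).
    """
    return 1.0 / (1.0 + math.exp(-delta_log_affinity))


def asb_log_likelihood(variants, rho):
    """Sum per-variant log-likelihoods; no significance threshold is applied.

    variants: iterable of (ref_count, alt_count, delta_log_affinity).
    """
    total = 0.0
    for ref_count, alt_count, delta in variants:
        p = predicted_ref_fraction(delta)
        total += beta_binomial_logpmf(ref_count, ref_count + alt_count, p, rho)
    return total


# Hypothetical read counts paired with model-predicted log-affinity ratios.
observed = [(30, 10, 1.0), (8, 24, -1.0), (15, 15, 0.0)]
# Null model: same counts, but no predicted allelic preference.
null = [(r, a, 0.0) for r, a, _ in observed]

ll_model = asb_log_likelihood(observed, rho=0.05)
ll_null = asb_log_likelihood(null, rho=0.05)
print(f"log-likelihood gain over null: {ll_model - ll_null:.3f}")
```

Because every heterozygous variant contributes to the sum regardless of its individual significance, two binding models for the same TF could be compared by their total log-likelihood on the same set of allelic counts.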