Benchmarking and building DNA binding affinity models using allele-specific and allele-agnostic transcription factor binding data

Authors
Li, Xiaoting [1]
Melo, Lucas A. N. [1]
Bussemaker, Harmen J. [1, 2]
Affiliations
[1] Columbia Univ, Dept Biol Sci, New York, NY 10027 USA
[2] Columbia Univ, Dept Syst Biol, New York, NY 10032 USA
Source
GENOME BIOLOGY | 2024 / Volume 25 / Issue 01
Keywords
Gene expression regulation; Non-coding variants; Transcription factors; Allele-specific binding; ChIP-seq; CTCF; Motif discovery; Biophysically interpretable machine learning; Statistical modeling; ChIP-exo; CUT&Tag; EBF1; PU.1/SPI1; SEQUENCE VARIATION; FACTOR OCCUPANCY; DISEASE; COMMON;
DOI
10.1186/s13059-024-03424-2
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Transcription factors (TFs) bind to DNA in a highly sequence-specific manner. This specificity manifests itself in vivo as differences in TF occupancy between the two alleles at heterozygous loci. Genome-scale assays such as ChIP-seq currently are limited in their power to detect allele-specific binding (ASB) both in terms of read coverage and representation of individual variants in the cell lines used. This makes prediction of allelic differences in TF binding from sequence alone desirable, provided that the reliability of such predictions can be quantitatively assessed.
Results: We here propose methods for benchmarking sequence-to-affinity models for TF binding in terms of their ability to predict allelic imbalances in ChIP-seq counts. We use a likelihood function based on an over-dispersed binomial distribution to aggregate evidence for allelic preference across the genome without requiring statistical significance for individual variants. This allows us to systematically compare predictive performance when multiple binding models for the same TF are available. To facilitate the de novo inference of high-quality models from paired-end in vivo binding data such as ChIP-seq, ChIP-exo, and CUT&Tag without read mapping or peak calling, we introduce an extensible reimplementation of our biophysically interpretable machine learning framework named PyProBound. Explicitly accounting for assay-specific bias in DNA fragmentation rate when training on ChIP-seq yields improved TF binding models. Moreover, we show how PyProBound can leverage our threshold-free ASB likelihood function to perform de novo motif discovery using allele-specific ChIP-seq counts.
Conclusion: Our work provides new strategies for predicting the functional impact of non-coding variants.
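The threshold-free likelihood described in the Results can be illustrated with a small sketch. It is not the paper's implementation or the PyProBound API. It assumes a beta-binomial as the over-dispersed binomial, assumes that the expected alternate-allele read fraction is the logistic function of a model-predicted log affinity ratio, and uses a hypothetical fixed `concentration` parameter for the over-dispersion. All function names and the toy counts are made up for illustration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, alpha, beta):
    """Log-probability of k successes in n trials under a beta-binomial."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def asb_log_likelihood(counts, delta_log_affinity, concentration=20.0):
    """Sum beta-binomial log-likelihoods over heterozygous variants.

    counts: list of (ref_reads, alt_reads) per variant.
    delta_log_affinity: predicted log(K_alt / K_ref) per variant (assumed input).
    concentration: assumed over-dispersion parameter (alpha + beta);
        smaller values mean more over-dispersion.
    """
    total = 0.0
    for (ref, alt), delta in zip(counts, delta_log_affinity):
        # Assumption: occupancy is proportional to affinity, so the expected
        # alternate-allele fraction is K_alt / (K_alt + K_ref) = sigmoid(delta).
        p_alt = 1.0 / (1.0 + math.exp(-delta))
        total += betabinom_logpmf(
            alt, ref + alt, p_alt * concentration, (1.0 - p_alt) * concentration
        )
    return total


# Toy data (invented): three variants with (ref, alt) read counts.
counts = [(30, 10), (5, 25), (12, 14)]
model_deltas = [-1.0, 1.5, 0.1]   # hypothetical predictions from a binding model
null_deltas = [0.0, 0.0, 0.0]     # null model: no allelic preference

ll_model = asb_log_likelihood(counts, model_deltas)
ll_null = asb_log_likelihood(counts, null_deltas)
print(f"model: {ll_model:.3f}  null: {ll_null:.3f}")
```

Because every variant contributes to the sum regardless of whether its own imbalance is individually significant, two candidate binding models for the same TF can be ranked by comparing their total log-likelihoods on the same counts. In practice the over-dispersion would be estimated from the data rather than fixed.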
Pages: 15