Predicting CTCF cell type active binding sites in human genome

Cited by: 0
Authors
Chai, Lu [1]
Gao, Jie [1]
Li, Zihan [1]
Sun, Hao [1]
Liu, Junjie [1]
Wang, Yong [2]
Zhang, Lirong [1]
Affiliations
[1] Inner Mongolia Univ, Sch Phys Sci & Technol, Hohhot 010021, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, CEMS, NCMIS, HCMS, MDIS, Beijing 100190, Peoples R China
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
CTCF binding site; Convolutional neural networks; Chromatin accessibility; RAD21; SMC3; TRANSCRIPTION; DISCOVERY; EXPANSION; TOPOLOGY; PROMOTER;
DOI
10.1038/s41598-024-82238-5
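Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes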
07 ; 0710 ; 09 ;
Abstract
The CCCTC-binding factor (CTCF) is pivotal in orchestrating diverse biological functions across the human genome, yet the mechanisms driving its cell type-active DNA binding affinity remain underexplored. Here, we collected ChIP-seq data from 67 cell lines in ENCODE, constructed a unique dataset of cell type-active CTCF binding sites (CBS), and trained convolutional neural networks (CNNs) to dissect the patterns of CTCF binding activity. Our analysis reveals that the transcription factors RAD21 and SMC3, together with chromatin accessibility, are more predictive than sequence motifs and histone modifications. Integrating these features achieved AUPRC values consistently above 0.868, highlighting their utility in deciphering CTCF binding dynamics. This study provides a deeper understanding of the regulatory functions of CTCF via a machine learning framework.
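The abstract reports performance as AUPRC (area under the precision-recall curve). As a rough illustration of what that metric measures, the sketch below computes the step-wise form of AUPRC (average precision) in plain Python. The function name `average_precision` and the toy labels and scores are assumptions made for this example only; they are not taken from the paper, whose exact evaluation procedure is not given in this record.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: iterable of 0/1 ground-truth values (1 = active binding site).
    scores: iterable of predicted scores, higher = more likely positive.
    Note: tied scores are ranked in input order, a simplification.
    """
    labels = list(labels)
    scores = list(scores)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Each recovered positive raises recall by 1/total_positives;
            # weight that step by the precision at this rank.
            area += (true_positives / rank) / total_positives
    return area


# Hypothetical toy data: three positives among six candidate sites.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auprc = average_precision(labels, scores)
print(f"{auprc:.3f}")  # prints 0.806
```

Unlike accuracy, this metric depends only on how the positives are ranked relative to the negatives, which makes it a common choice when positive sites are rare compared with background regions.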
Pages: 14