Computational prediction and characterization of cell-type-specific and shared binding sites

Cited by: 7
Authors
Zhang, Qinhu [1, 2]
Teng, Pengrui [3 ]
Wang, Siguo [4 ]
He, Ying [4 ]
Cui, Zhen [4 ]
Guo, Zhenghao [4 ]
Liu, Yixin [5 ]
Yuan, Changan [6 ]
Liu, Qi [1, 2]
Huang, De-Shuang [7 ]
Affiliations
[1] Tongji Univ, Translat Med Ctr Stem Cell Therapy, Shanghai 200092, Peoples R China
[2] Tongji Univ, Shanghai East Hosp, Inst Regenerat Med, Sch Life Sci & Technol,Bioinformat Dept, Shanghai 200092, Peoples R China
[3] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[4] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
[5] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai 200093, Peoples R China
[6] Guangxi Acad Sci, Big Data & Intelligent Comp Res Ctr, Nanning 530007, Peoples R China
[7] EIT Inst Adv Study, Ningbo 315201, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
CHIP-SEQ; DNA; SEQUENCE; REVEALS;
DOI
10.1093/bioinformatics/btac798
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent studies have provided evidence that such cell-type-specific binding is determined by a TF's intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, the computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied.
Results: In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one is based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of the proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, and the binding sites of each factor were further categorized into cell-type-specific (GM12878- and K562-specific) and shared binding sites. Multiple types of features for these binding sites were then integrated to train the XGBoost- and CNN-based models. Experimental results show that the proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values, and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites when DNase signals are excluded or included. Furthermore, we investigated the generalization ability of the proposed approaches to different binding factors in the same cellular environment.
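The categorization step described in the abstract (splitting each factor's binding sites into GM12878-specific, K562-specific and shared sets) can be illustrated with a minimal sketch. The abstract does not state the criterion the authors used, so the rule below (a site is "shared" if its interval overlaps any site in the other cell line) is an assumption, as are the hypothetical names `Peak`, `overlaps` and `categorize_sites` and the half-open coordinate convention.

```python
from typing import Dict, List, Tuple

# Hypothetical peak representation: (chromosome, start, end), half-open [start, end).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def categorize_sites(gm12878: List[Peak], k562: List[Peak]) -> Dict[str, List[Peak]]:
    """Split peaks into shared and cell-type-specific sets.

    Assumption (not taken from the paper): a GM12878 peak is 'shared' if it
    overlaps any K562 peak; otherwise it is GM12878-specific. K562 peaks with
    no overlapping GM12878 peak are K562-specific. Shared sites are reported
    using their GM12878 coordinates.
    """
    shared: List[Peak] = []
    gm_specific: List[Peak] = []
    for p in gm12878:
        if any(overlaps(p, q) for q in k562):
            shared.append(p)
        else:
            gm_specific.append(p)
    k562_specific = [q for q in k562 if not any(overlaps(q, p) for p in gm12878)]
    return {
        "shared": shared,
        "GM12878-specific": gm_specific,
        "K562-specific": k562_specific,
    }


if __name__ == "__main__":
    # Toy coordinates for illustration only; these are not real ChIP-seq peaks.
    gm = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    k5 = [("chr1", 150, 250), ("chr2", 300, 400)]
    for label, sites in categorize_sites(gm, k5).items():
        print(label, sites)
```

The quadratic scan keeps the sketch short; for genome-scale peak sets a sorted sweep or an interval tree would be the more practical choice.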
Pages: 9