Computational prediction and characterization of cell-type-specific and shared binding sites

Cited by: 7
Authors
Zhang, Qinhu [1, 2]
Teng, Pengrui [3]
Wang, Siguo [4]
He, Ying [4]
Cui, Zhen [4]
Guo, Zhenghao [4]
Liu, Yixin [5]
Yuan, Changan [6]
Liu, Qi [1, 2]
Huang, De-Shuang [7]
Affiliations
[1] Tongji Univ, Translat Med Ctr Stem Cell Therapy, Shanghai 200092, Peoples R China
[2] Tongji Univ, Shanghai East Hosp, Inst Regenerat Med, Sch Life Sci & Technol,Bioinformat Dept, Shanghai 200092, Peoples R China
[3] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[4] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
[5] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai 200093, Peoples R China
[6] Guangxi Acad Sci, Big Data & Intelligent Comp Res Ctr, Nanning 530007, Peoples R China
[7] EIT Inst Adv Study, Ningbo 315201, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
CHIP-SEQ; DNA; SEQUENCE; REVEALS;
DOI
10.1093/bioinformatics/btac798
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent studies have provided evidence that such cell-type-specific binding is determined by a TF's intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, the computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied.
Results: In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one is based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of the proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, and the binding sites of each factor were categorized into cell-type-specific (GM12878- and K562-specific) and shared binding sites. Multiple types of features for these binding sites were then integrated to train the XGBoost- and CNN-based models. Experimental results show that the proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values, and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites when DNase signals are excluded or included. Furthermore, we investigated how well the proposed approaches generalize to different binding factors in the same cellular environment.
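The abstract does not say how binding sites were assigned to the GM12878-specific, K562-specific and shared categories. As a rough illustration only, the sketch below assumes a simple rule: a peak is shared if it overlaps at least one peak from the other cell line, and cell-type-specific otherwise. The overlap criterion, the `Peak` type and the function names `overlaps` and `categorize_peaks` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A peak is (chromosome, start, end) with a half-open interval [start, end).
# This representation is an assumption made for the example.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def categorize_peaks(gm12878: List[Peak], k562: List[Peak]) -> Dict[str, List[Peak]]:
    """Split peaks into GM12878-specific, K562-specific and shared sets.

    Assumed rule (not from the paper): a peak is 'shared' if it overlaps
    any peak from the other cell line. Shared sites are reported using
    their GM12878 coordinates.
    """
    shared = [p for p in gm12878 if any(overlaps(p, q) for q in k562)]
    gm_specific = [p for p in gm12878 if not any(overlaps(p, q) for q in k562)]
    k562_specific = [q for q in k562 if not any(overlaps(q, p) for p in gm12878)]
    return {
        "GM12878-specific": gm_specific,
        "K562-specific": k562_specific,
        "shared": shared,
    }


# Toy peaks, invented for illustration.
gm_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
k562_peaks = [("chr1", 150, 250), ("chr2", 100, 200)]
groups = categorize_peaks(gm_peaks, k562_peaks)
```

The pairwise comparison is quadratic in the number of peaks, which is fine for a toy example; genome-scale peak sets would normally be sorted or indexed by an interval tree first.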
Pages: 9