Computational prediction and characterization of cell-type-specific and shared binding sites

Cited by: 7
Authors
Zhang, Qinhu [1, 2]
Teng, Pengrui [3 ]
Wang, Siguo [4 ]
He, Ying [4 ]
Cui, Zhen [4 ]
Guo, Zhenghao [4 ]
Liu, Yixin [5 ]
Yuan, Changan [6 ]
Liu, Qi [1, 2]
Huang, De-Shuang [7 ]
Affiliations
[1] Tongji Univ, Translat Med Ctr Stem Cell Therapy, Shanghai 200092, Peoples R China
[2] Tongji Univ, Shanghai East Hosp, Inst Regenerat Med, Sch Life Sci & Technol,Bioinformat Dept, Shanghai 200092, Peoples R China
[3] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[4] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
[5] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai 200093, Peoples R China
[6] Guangxi Acad Sci, Big Data & Intelligent Comp Res Ctr, Nanning 530007, Peoples R China
[7] EIT Inst Adv Study, Ningbo 315201, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
CHIP-SEQ; DNA; SEQUENCE; REVEALS;
DOI
10.1093/bioinformatics/btac798
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) that selectively bind distinct sets of sites in different cell types. Recent studies provide evidence that such cell-type-specific binding is determined by a TF's intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, the computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied.
Results: In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of the proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, and the binding sites of each factor were categorized as cell-type-specific (GM12878- or K562-specific) or shared. Multiple types of features for these binding sites were then integrated to train the XGBoost- and CNN-based models. Experimental results show that the proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values, and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites with DNase signals excluded or included. Furthermore, we investigated how well the proposed approaches generalize to different binding factors in the same cellular environment.
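The categorization step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the criterion the authors used to call a site shared or cell-type-specific, so the rule below (any overlap on the same chromosome counts as shared) is an assumption, and the names `Peak`, `overlaps` and `categorize_peaks` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, List, Tuple

# A peak is (chromosome, start, end) with a half-open interval [start, end).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """Return True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def categorize_peaks(gm12878: List[Peak], k562: List[Peak]) -> Dict[str, List[Peak]]:
    """Split peaks of one binding factor into shared and cell-type-specific sets.

    Assumption (not taken from the paper): a GM12878 peak is 'shared' if it
    overlaps any K562 peak; peaks with no overlap in the other cell line are
    'specific' to their own cell line.
    """
    shared = [p for p in gm12878 if any(overlaps(p, q) for q in k562)]
    gm_specific = [p for p in gm12878 if not any(overlaps(p, q) for q in k562)]
    k562_specific = [q for q in k562 if not any(overlaps(q, p) for p in gm12878)]
    return {
        "shared": shared,
        "GM12878-specific": gm_specific,
        "K562-specific": k562_specific,
    }


if __name__ == "__main__":
    # Toy coordinates, not real ChIP-seq data.
    gm = [("chr1", 100, 200), ("chr1", 500, 600)]
    k5 = [("chr1", 150, 250), ("chr2", 100, 200)]
    for label, peaks in categorize_peaks(gm, k5).items():
        print(label, peaks)
```

The quadratic scan keeps the sketch short; for genome-scale peak sets an interval index or a sorted sweep would be the usual choice.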
Pages: 9