An efficient method to transcription factor binding sites imputation via simultaneous completion of multiple matrices with positional consistency

Cited by: 17
Authors
Guo, Wei-Li [1]
Huang, De-Shuang [1]
Affiliations
[1] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
CHIP-SEQ; DNA-BINDING; ENCODE; DISCOVERY; NETWORKS; MOTIFS
DOI
10.1039/c7mb00155j
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Transcription factors (TFs) are DNA-binding proteins that play a central role in regulating gene expression. Identification of DNA-binding sites of TFs is a key task in understanding transcriptional regulation, cellular processes and disease. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) enables genome-wide identification of in vivo TF binding sites. However, it is still difficult to map every TF in every cell line owing to cost and biological material availability, which poses an enormous obstacle for integrated analysis of gene regulation. To address this problem, we propose a novel computational approach, TFBSImpute, for predicting additional TF binding profiles by leveraging information from available ChIP-seq TF binding data. TFBSImpute fuses the dataset into a 3-mode tensor and imputes missing TF binding signals via simultaneous completion of multiple TF binding matrices with positional consistency. By assessing the performance of imputation methods against observed ChIP-seq TF binding profiles, we show that signals predicted by our method achieve overall similarity with experimental data and that TFBSImpute significantly outperforms baseline approaches. In addition, motif analysis shows that TFBSImpute performs better than the baselines in capturing binding motifs enriched in observed data, indicating that the higher performance of TFBSImpute is not simply due to averaging related samples. We anticipate that our approach will constitute a useful complement to experimental mapping of TF binding, which is beneficial for further study of regulation mechanisms and disease.
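The general idea of arranging binding tracks in a 3-mode tensor and filling in an unprofiled track can be illustrated with a small sketch. This is not the TFBSImpute algorithm: the record does not give its objective or its positional-consistency term, so the sketch substitutes a generic iterative truncated-SVD ("hard-impute") completion on one unfolding of the tensor. The function name `impute_tracks`, the tensor layout (cell line, TF, position), the rank, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def impute_tracks(tensor, rank=1, n_iter=200):
    """Fill NaN entries of a (cell_line, tf, position) tensor.

    Generic low-rank completion, NOT the published TFBSImpute method.
    """
    n_cells = tensor.shape[0]
    # Unfold along the cell-line mode: one row per cell line,
    # one column per (TF, position) pair.
    mat = tensor.reshape(n_cells, -1)
    observed = ~np.isnan(mat)
    # Initialise missing entries with the mean of the observed signal.
    filled = np.where(observed, mat, np.nanmean(mat))
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Observed values are kept fixed; only missing entries are updated.
        filled = np.where(observed, mat, low_rank)
    return filled.reshape(tensor.shape)


# Hypothetical toy data: 4 cell lines x 3 TFs x 5 positions, exactly rank 1
# in the cell-line unfolding.
cell_effect = np.array([1.0, 2.0, 3.0, 4.0])
tf_profiles = np.arange(1.0, 16.0).reshape(3, 5)
truth = cell_effect[:, None, None] * tf_profiles[None, :, :]

data = truth.copy()
data[0, 0, :] = np.nan  # pretend TF 0 was never profiled in cell line 0

completed = impute_tracks(data, rank=1)
print(np.abs(completed - truth).max())
```

Unfolding along the cell-line mode matters here: a missing track becomes a block of missing columns in one row, which can be inferred from the other TFs observed in that cell line and from the same TF observed in other cell lines. Real ChIP-seq signal is noisy and not exactly low rank, so the rank and iteration count would need tuning on held-out tracks.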
Pages: 1827-1837
Page count: 11