DeepCAGE: Incorporating Transcription Factors in Genome-wide Prediction of Chromatin Accessibility

Cited by: 8
Authors
Liu, Qiao [1,2,3,4]
Hua, Kui [1,2,3]
Zhang, Xuegong [1,2,3]
Wong, Wing Hung [4]
Jiang, Rui [1,2,3]
Affiliations
[1] Tsinghua Univ, Minist Educ, Key Lab Bioinformat, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Bioinformat Div, Beijing Natl Res Ctr Informat Sci & Technol, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Ctr Synthet & Syst Biol, Dept Automat, Beijing 100084, Peoples R China
[4] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
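Funding
National Institutes of Health (USA); National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords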
Chromatin accessibility; Deep learning; Transcription factor; Gene expression; NONCODING VARIANTS; GENE-EXPRESSION; SEQUENCE; DATABASE; NETWORKS; ELEMENTS;
DOI
10.1016/j.gpb.2021.08.015
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Although computational approaches have been complementing high-throughput biological experiments for the identification of functional regions in the human genome, it remains a great challenge to systematically decipher interactions between transcription factors (TFs) and regulatory elements to achieve interpretable annotations of chromatin accessibility across diverse cellular contexts. To solve this problem, we propose DeepCAGE, a deep learning framework that integrates sequence information and binding statuses of TFs, for the accurate prediction of chromatin accessible regions at a genome-wide scale in a variety of cell types. DeepCAGE takes advantage of a densely connected deep convolutional neural network architecture to automatically learn sequence signatures of known chromatin accessible regions and then incorporates such features with expression levels and binding activities of human core TFs to predict novel chromatin accessible regions. In a series of systematic comparisons with existing methods, DeepCAGE exhibits superior performance in not only the classification but also the regression of chromatin accessibility signals. In a detailed analysis of TF activities, DeepCAGE successfully extracts novel binding motifs and measures the contribution of a TF to the regulation with respect to a specific locus in a certain cell type. When applied to whole-genome sequencing data analysis, our method successfully prioritizes putative deleterious variants underlying a human complex trait and thus provides insights into the understanding of disease-associated genetic variants. DeepCAGE can be downloaded from https://github.com/kimmo1019/DeepCAGE.
Pages: 496-507
Number of pages: 12