Hold out the genome: a roadmap to solving the cis-regulatory code

Cited by: 22
Authors
de Boer, Carl G. [1]
Taipale, Jussi [2, 3, 4]
Affiliations
[1] Univ British Columbia, Sch Biomed Engn, Vancouver, BC, Canada
[2] Univ Helsinki, Fac Med, Appl Tumor Genom Res Program, Helsinki, Finland
[3] Karolinska Inst, Dept Med Biochem & Biophys, Stockholm, Sweden
[4] Univ Cambridge, Dept Biochem, Cambridge, England
Keywords
ENHANCER ACTIVITY MAPS; TRANSCRIPTION FACTORS; SHADOW ENHANCERS; GENE; SEQUENCE; BINDING; EVOLUTION; EXPRESSION; ELEMENTS; MODEL
DOI
10.1038/s41586-023-06661-w
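Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract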
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Pages: 41-50
Page count: 10
Related papers
50 in total
  • [11] Parallel Evolution of Chordate Cis-Regulatory Code for Development
    Doglio, Laura
    Goode, Debbie K.
    Pelleri, Maria C.
    Pauls, Stefan
    Frabetti, Flavia
    Shimeld, Sebastian M.
    Vavouri, Tanya
    Elgar, Greg
    PLOS GENETICS, 2013, 9 (11)
  • [12] Cis-regulatory code of stress-responsive transcription in Arabidopsis thaliana
    Zou, Cheng
    Sun, Kelian
    Mackaluso, Joshua D.
    Seddon, Alexander E.
    Jin, Rong
    Thomashow, Michael F.
    Shiu, Shin-Han
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (36) : 14992 - 14997
  • [13] The cis-regulatory effects of modern human-specific variants
    Weiss, Carly V.
    Harshman, Lana
    Inoue, Fumitaka
    Fraser, Hunter B.
    Petrov, Dmitri A.
    Ahituv, Nadav
    Gokhman, David
    ELIFE, 2021, 10
  • [14] Widespread long-range cis-regulatory elements in the maize genome
    Ricci, William A.
    Lu, Zefu
    Ji, Lexiang
    Marand, Alexandre P.
    Ethridge, Christina L.
    Murphy, Nathalie G.
    Noshay, Jaclyn M.
    Galli, Mary
    Mejia-Guerra, Maria Katherine
    Colome-Tatche, Maria
    Johannes, Frank
    Rowley, M. Jordan
    Corces, Victor G.
    Zhai, Jixian
    Scanlon, Michael J.
    Buckler, Edward S.
    Gallavotti, Andrea
    Springer, Nathan M.
    Schmitz, Robert J.
    Zhang, Xiaoyu
    NATURE PLANTS, 2019, 5 (12) : 1237 - 1249
  • [15] A compendium and comparative epigenomics analysis of cis-regulatory elements in the pig genome
    Zhao, Yunxia
    Hou, Ye
    Xu, Yueyuan
    Luan, Yu
    Zhou, Huanhuan
    Qi, Xiaolong
    Hu, Mingyang
    Wang, Daoyuan
    Wang, Zhangxu
    Fu, Yuhua
    Li, Jingjin
    Zhang, Saixian
    Chen, Jianhai
    Han, Jianlin
    Li, Xinyun
    Zhao, Shuhong
    NATURE COMMUNICATIONS, 2021, 12 (01)
  • [16] Genome surveyor 2.0: cis-regulatory analysis in Drosophila
    Kazemian, Majid
    Brodsky, Michael H.
    Sinha, Saurabh
    NUCLEIC ACIDS RESEARCH, 2011, 39 : W79 - W85
  • [17] Functional cis-regulatory genomics for systems biology
    Nam, Jongmin
    Dong, Ping
    Tarpine, Ryan
    Istrail, Sorin
    Davidson, Eric H.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2010, 107 (08) : 3930 - 3935
  • [18] Deciphering the multi-scale, quantitative cis-regulatory code
    Kim, Seungsoo
    Wysocka, Joanna
    MOLECULAR CELL, 2023, 83 (03) : 373 - 392
  • [19] Identifying Cis-Regulatory Changes Involved in the Evolution of Aerobic Fermentation in Yeasts
    Lin, Zhenguo
    Wang, Tzi-Yuan
    Tsai, Bing-Shi
    Wu, Fang-Ting
    Yu, Fu-Jung
    Tseng, Yu-Jung
    Sung, Huang-Mo
    Li, Wen-Hsiung
    GENOME BIOLOGY AND EVOLUTION, 2013, 5 (06) : 1065 - 1078
  • [20] Identification of cis-Regulatory Elements in the Mammalian Genome: The cREMaG Database
    Piechota, Marcin
    Korostynski, Michal
    Przewlocki, Ryszard
    PLOS ONE, 2010, 5 (08)