Multi-study inference of regulatory networks for more accurate models of gene regulation

Cited by: 47
Authors
Castro, Dayanne M. [1]
de Veaux, Nicholas R. [2]
Miraldi, Emily R. [3, 4, 5]
Bonneau, Richard [1, 2]
Affiliations
[1] NYU, 550 1st Ave, New York, NY 10003 USA
[2] Flatiron Inst, Ctr Computat Biol, New York, NY 10010 USA
[3] Univ Cincinnati, Dept Pediat, Coll Med, Cincinnati, OH 45229 USA
[4] Cincinnati Childrens Hosp, Div Immunobiol, Cincinnati, OH 45229 USA
[5] Cincinnati Childrens Hosp, Div Biomed Informat, Cincinnati, OH 45229 USA
Keywords
TRANSCRIPTIONAL REGULATION; OPEN CHROMATIN; EXPRESSION; RECONSTRUCTION; ARCHITECTURE; LANDSCAPE; SELECTION; DATABASE; LASSO
DOI
10.1371/journal.pcbi.1006591
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integrating data across different studies, however, raises numerous technical concerns. Hence, a common approach in network inference, and broadly in genomics research, is to separately learn models from each dataset and combine the results. Individual models, however, often suffer from under-sampling, poor generalization and limited network recovery. In this study, we explore previous integration strategies, such as batch-correction and model ensembles, and introduce a new multitask learning approach for joint network inference across several datasets. Our method initially estimates the activities of transcription factors, and subsequently, infers the relevant network topology. As regulatory interactions are context-dependent, we estimate model coefficients as a combination of both dataset-specific and conserved components. In addition, adaptive penalties may be used to favor models that include interactions derived from multiple sources of prior knowledge including orthogonal genomics experiments. We evaluate generalization and network recovery using examples from Bacillus subtilis and Saccharomyces cerevisiae, and show that sharing information across models improves network reconstruction. Finally, we demonstrate robustness to both false positives in the prior information and heterogeneity among datasets.
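The two-step procedure described in the abstract (estimate transcription factor activities, then fit coefficients that combine conserved and dataset-specific components) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state which penalties or solver are used, so the choices below are assumptions: activities come from a least-squares solve against a prior network, the conserved component is encouraged by a row-wise group (L2) penalty, the dataset-specific component by an L1 penalty, and the fit uses plain proximal gradient descent. All function names, parameter names, and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_tfa(prior, expression):
    """Assumed step 1: least-squares TF activities A with expression ~ prior @ A.
    prior is genes x TFs, expression is genes x samples."""
    return np.linalg.pinv(prior) @ expression

def soft_threshold(x, t):
    """Elementwise L1 proximal operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def row_group_threshold(x, t):
    """Row-wise L2 proximal operator: shrinks each TF's row across datasets."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x * np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)

def objective(designs, targets, shared, specific, lam_shared, lam_specific):
    """Sum of per-dataset squared errors plus both penalties."""
    loss = 0.0
    for k, (X, y) in enumerate(zip(designs, targets)):
        resid = y - X @ (shared[:, k] + specific[:, k])
        loss += resid @ resid / (2 * len(y))
    loss += lam_shared * np.linalg.norm(shared, axis=1).sum()
    loss += lam_specific * np.abs(specific).sum()
    return loss

def fit_multitask(designs, targets, lam_shared=0.1, lam_specific=0.1, n_iter=500):
    """Assumed step 2: for one target gene, fit weights = shared + specific.
    designs[k] is samples x TFs (activities), targets[k] is the gene's expression."""
    n_tf, n_tasks = designs[0].shape[1], len(designs)
    shared = np.zeros((n_tf, n_tasks))
    specific = np.zeros((n_tf, n_tasks))
    # The same gradient applies to both components, so the joint Lipschitz
    # constant is twice the largest per-dataset curvature.
    lipschitz = 2 * max(np.linalg.eigvalsh(X.T @ X / len(X)).max() for X in designs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = np.column_stack([
            X.T @ (X @ (shared[:, k] + specific[:, k]) - y) / len(y)
            for k, (X, y) in enumerate(zip(designs, targets))
        ])
        shared = row_group_threshold(shared - step * grad, step * lam_shared)
        specific = soft_threshold(specific - step * grad, step * lam_specific)
    return shared, specific

# Synthetic usage: two datasets that share the same true regulators.
rng = np.random.default_rng(0)
true_weights = np.array([1.5, 0.0, -1.0, 0.0, 0.0])
designs = [rng.normal(size=(40, 5)), rng.normal(size=(60, 5))]
targets = [X @ true_weights + 0.1 * rng.normal(size=len(X)) for X in designs]
shared, specific = fit_multitask(designs, targets)
weights = shared + specific  # one column of TF weights per dataset
```

In this sketch, a regulator supported by every dataset tends to be absorbed by the shared component, while an interaction seen in only one dataset can still enter through the specific component. The adaptive penalties mentioned in the abstract could be approximated by scaling `lam_shared` and `lam_specific` per TF according to prior support, which is likewise an assumption rather than the published procedure.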
Pages: 22