SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms

Cited by: 212
Authors
Van den Bulcke, T
Van Leemput, K
Naudts, B
van Remortel, P
Ma, HW
Verschoren, A
De Moor, B
Marchal, K
Affiliations
[1] Katholieke Univ Leuven, SCD, ESAT, B-3001 Heverlee, Belgium
[2] Univ Antwerp, Dept Math & Comp Sci, ISLab, B-2020 Antwerp, Belgium
[3] German Res Ctr Biotechnol, Dept Genome Anal, D-38124 Braunschweig, Germany
[4] Katholieke Univ Leuven, Dept Microbial & Mol Syst, CMPG, B-3001 Heverlee, Belgium
[5] FWO Vlaanderen, Vlaanderen, Belgium
DOI
10.1186/1471-2105-7-43
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The development of algorithms to infer the structure of gene regulatory networks based on expression data is an important subject in bioinformatics research. Validation of these algorithms requires benchmark data sets for which the underlying network is known. Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner.
Results: In this paper we describe a network generator that creates synthetic transcriptional regulatory networks and produces simulated gene expression data that approximates experimental data. Network topologies are generated by selecting subnetworks from previously described regulatory networks. Interaction kinetics are modeled by equations based on Michaelis-Menten and Hill kinetics. Our results show that the statistical properties of these topologies more closely approximate those of genuine biological networks than do those of different types of random graph models. Several user-definable parameters adjust the complexity of the resulting data set with respect to the structure learning algorithms.
Conclusion: This network generation technique offers a valid alternative to existing methods. The topological characteristics of the generated networks more closely resemble the characteristics of real transcriptional networks. Simulation of the network scales well to large networks. The generator models different types of biological interactions and produces biologically plausible synthetic gene expression data.
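The abstract states that interaction kinetics are based on Michaelis-Menten and Hill kinetics but does not give the equations. As a rough illustration only, the textbook forms of these rate laws can be sketched as below. The function names, parameter names (`k`, `n`, `vmax`, `km`), and default values are assumptions chosen for this example and are not taken from SynTReN itself.

```python
def hill_activation(x, k=1.0, n=2.0):
    """Textbook Hill function: fraction of maximal transcription
    induced by an activator at concentration x.
    k is the half-saturation constant, n the Hill coefficient.
    (Illustrative names; not SynTReN's actual parameters.)"""
    if x < 0 or k <= 0:
        raise ValueError("x must be >= 0 and k must be > 0")
    xn = x ** n
    return xn / (k ** n + xn)


def hill_repression(x, k=1.0, n=2.0):
    """Complementary Hill function: fraction of maximal transcription
    remaining when a repressor is present at concentration x."""
    return 1.0 - hill_activation(x, k, n)


def michaelis_menten(x, vmax=1.0, km=1.0):
    """Textbook Michaelis-Menten rate law; equals vmax times the
    Hill activation function with n = 1 and k = km."""
    if x < 0 or km <= 0:
        raise ValueError("x must be >= 0 and km must be > 0")
    return vmax * x / (km + x)


if __name__ == "__main__":
    # Print how activation and repression respond to regulator level.
    for conc in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(conc, hill_activation(conc), hill_repression(conc))
```

In this sketch a larger Hill coefficient `n` makes the response more switch-like, which is one plausible way a generator could vary the difficulty of the resulting data set; whether SynTReN exposes such a parameter is not stated in the abstract.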
Pages: 12