Comparison of statistical methods for finding network motifs

Cited by: 5
Authors
Albieri, Vanna [2]
Didelez, Vanessa [1]
Affiliations
[1] Univ Bristol, Sch Math, Bristol BS8 1TW, Avon, England
[2] Danish Canc Soc Res Ctr, Stat Bioinformat & Registry, Copenhagen, Denmark
Keywords
Gaussian graphical models; Lasso; PC-algorithm; Shrinkage; Maximum likelihood estimation; Directed acyclic graphs; Selection; Model
DOI
10.1515/sagmb-2013-0017
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
There has been much recent interest in systems biology in investigating the structure of gene regulatory systems. Such networks are often built from specific patterns, or network motifs, that are interesting from a biological point of view. Our aim in the present paper is to compare statistical methods specifically with regard to how well they can detect such motifs. One popular approach is network analysis with Gaussian graphical models (GGMs), which are statistical models associated with undirected graphs, where vertices of the graph represent genes and edges indicate regulatory interactions. Gene expression microarray data allow us to observe the amount of mRNA simultaneously for a large number of genes p under n different experimental conditions, where p is usually much larger than n, prohibiting the use of standard methods. We therefore compare the performance of a number of procedures that have been specifically designed to address this large p-small n issue: G-Lasso estimation, neighbourhood selection, shrinkage estimation using empirical Bayes for model selection, and the PC-algorithm. We found that all approaches performed poorly on the benchmark E. coli network. Hence we systematically studied their ability to detect specific network motifs (pairs, hubs and cascades) in extensive simulations. We conclude that all methods have difficulty detecting hubs, but the PC-algorithm is most promising.
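As an illustration of what an edge in a GGM encodes, the sketch below simulates a three-gene cascade A -> B -> C and compares the marginal correlation of A and C with their partial correlation given B. It is a minimal example under stated assumptions, not the procedure used in the paper: the function names, the coefficient 0.8, the sample size and the seed are all hypothetical, and the three-variable closed form for partial correlation stands in for the high-dimensional estimators the abstract lists.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y given a single third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_cascade(n, seed=0):
    """Hypothetical linear Gaussian cascade A -> B -> C (illustrative values)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [0.8 * ai + rng.gauss(0, 1) for ai in a]
    c = [0.8 * bi + rng.gauss(0, 1) for bi in b]
    return a, b, c


a, b, c = simulate_cascade(5000)
r_ab, r_bc, r_ac = pearson(a, b), pearson(b, c), pearson(a, c)

# A and C are marginally correlated, but conditionally independent given B,
# so the undirected GGM has edges A-B and B-C and no edge A-C.
pc_ac = partial_corr(r_ac, r_ab, r_bc)
pc_ab = partial_corr(r_ab, r_ac, r_bc)

print(f"marginal corr(A, C)    = {r_ac:.3f}")
print(f"partial corr(A, C | B) = {pc_ac:.3f}")
print(f"partial corr(A, B | C) = {pc_ab:.3f}")
```

With many more samples than variables, as here, the partial correlations can be estimated directly. When p is much larger than n, the sample covariance matrix is singular and cannot be inverted, which is why regularised or constraint-based methods are needed.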
Pages: 403-422
Number of pages: 20