Model Selection and Hypothesis Testing for Large-Scale Network Models with Overlapping Groups

Cited by: 73
Author
Peixoto, Tiago P. [1]
Affiliation
[1] Univ Bremen, Inst Theoret Phys, D-28359 Bremen, Germany
Source
PHYSICAL REVIEW X | 2015 / Vol. 5 / Issue 01
Keywords
COMMUNITY STRUCTURE; COMPLEX NETWORKS; INTERACTOME; BLOCKMODELS; DYNAMICS
DOI
10.1103/PhysRevX.5.011033
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
The effort to understand network systems in increasing detail has resulted in a diversity of methods designed to extract their large-scale structure from data. Unfortunately, many of these methods yield diverging descriptions of the same network, making both the comparison and understanding of their results a difficult challenge. A possible solution to this outstanding issue is to shift the focus away from ad hoc methods and move towards more principled approaches based on statistical inference of generative models. As a result, we face instead the more well-defined task of selecting between competing generative processes, which can be done under a unified probabilistic framework. Here, we consider the comparison between a variety of generative models including features such as degree correction, where nodes with arbitrary degrees can belong to the same group, and community overlap, where nodes are allowed to belong to more than one group. Because such model variants possess an increasing number of parameters, they become prone to overfitting. In this work, we present a method of model selection based on the minimum description length criterion and posterior odds ratios that is capable of fully accounting for the increased degrees of freedom of the larger models and selects the best one according to the statistical evidence available in the data. In applying this method to many empirical unweighted networks from different fields, we observe that community overlap is very often not supported by statistical evidence and is selected as a better model only for a minority of them. On the other hand, we find that degree correction tends to be almost universally favored by the available data, implying that intrinsic node properties (as opposed to group properties) are often an essential ingredient of network formation.
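The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes each candidate model has already been fitted and summarized by a description length in nats (the negative log of the joint probability of the data and the model parameters), and that all models are equally likely a priori, in which case the posterior odds ratio against the best model is exp(-ΔΣ), where ΔΣ is the difference in description length. The function name `select_model` and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, Tuple


def select_model(description_lengths: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Pick the model with the minimum description length (in nats).

    Also returns, for every candidate, its posterior odds ratio relative
    to the selected model, assuming equal prior probability for all
    candidates (an assumption of this sketch).
    """
    # The preferred model is the one that compresses the data best.
    best = min(description_lengths, key=description_lengths.get)
    best_dl = description_lengths[best]
    # Odds of each model against the best one: exp(-(Sigma - Sigma_best)).
    # The exponent is never positive, so each ratio lies in (0, 1].
    odds = {
        name: math.exp(-(dl - best_dl))
        for name, dl in description_lengths.items()
    }
    return best, odds


if __name__ == "__main__":
    # Made-up description lengths, for illustration only.
    candidates = {
        "degree-corrected, non-overlapping": 1000.0,
        "non-degree-corrected, non-overlapping": 1012.5,
        "degree-corrected, overlapping": 1003.0,
    }
    best, odds = select_model(candidates)
    print("selected:", best)
    for name, ratio in odds.items():
        print(f"{name}: posterior odds vs. best = {ratio:.3g}")
```

A ratio close to 1 means the data cannot clearly distinguish the alternative from the selected model, while a very small ratio means the alternative can be rejected with confidence.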
Pages: 20