Evaluating Overfit and Underfit in Models of Network Community Structure

被引:92
|
作者
Ghasemian, Amir [1 ]
Hosseinmardi, Homa [2 ]
Clauset, Aaron [3 ,4 ,5 ]
机构
[1] Univ Colorado, Dept Comp Sci, Boulder, CO 80309 USA
[2] Univ Southern Calif, Inst Informat Sci, Marina Del Rey, CA 90292 USA
[3] Univ Colorado, Boulder, CO 80309 USA
[4] Univ Colorado, Biofrontiers Inst, Boulder, CO 80305 USA
[5] Santa Fe Inst, 1399 Hyde Pk Rd, Santa Fe, NM 87501 USA
基金
美国国家科学基金会;
关键词
Task analysis; Clustering algorithms; Probabilistic logic; Bayes methods; Detection algorithms; Prediction algorithms; Partitioning algorithms; Community detection; model selection; overfitting; underfitting; link prediction; link description; STOCHASTIC BLOCKMODELS; COMPLEX NETWORKS; PREDICTION; RESOLUTION; SELECTION;
D O I
10.1109/TKDE.2019.2911585
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
A common graph mining task is community detection, which seeks an unsupervised decomposition of a network into groups based on statistical regularities in network connectivity. Although many such algorithms exist, community detection's No Free Lunch theorem implies that no algorithm can be optimal across all inputs. However, little is known in practice about how different algorithms over or underfit to real networks, or how to reliably assess such behavior across algorithms. Here, we present a broad investigation of over and underfitting across 16 state-of-the-art community detection algorithms applied to a novel benchmark corpus of 572 structurally diverse real-world networks. We find that (i) algorithms vary widely in the number and composition of communities they find, given the same input; (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks; (iii) algorithmic differences induce wide variation in accuracy on link-based learning tasks; and, (iv) no algorithm is always the best at such tasks across all inputs. Finally, we quantify each algorithm's overall tendency to over or underfit to network data using a theoretically principled diagnostic, and discuss the implications for future advances in community detection.
引用
收藏
页码:1722 / 1735
页数:14
相关论文
共 50 条
  • [31] Evaluating the Community Structures from Network Images Using Neural Networks
    Rahman, Md Khaledur
    Azad, Ariful
    COMPLEX NETWORKS AND THEIR APPLICATIONS VIII, VOL 1, 2020, 881 : 866 - 878
  • [32] Discovering community structure in Complex Network through Community Detection Approach
    Ismail, Suriana
    Ismail, Roslan
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2018), 2018,
  • [33] THE METHOD OF MOMENTS AND DEGREE DISTRIBUTIONS FOR NETWORK MODELS
    Bickel, Peter J.
    Chen, Aiyou
    Levina, Elizaveta
    ANNALS OF STATISTICS, 2011, 39 (05) : 2280 - 2301
  • [34] A Unified Framework for Community Detection and Network Representation Learning
    Tu, Cunchao
    Zeng, Xiangkai
    Wang, Hao
    Zhang, Zhengyan
    Liu, Zhiyuan
    Sun, Maosong
    Zhang, Bo
    Lin, Leyu
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (06) : 1051 - 1065
  • [35] Constructing and evaluating generalized models for a base-isolated structure
    Brewick, Patrick T.
    Johnson, Erik A.
    Sato, Eiji
    Sasaki, Tomohiro
    STRUCTURAL CONTROL & HEALTH MONITORING, 2018, 25 (11)
  • [36] Exponential random graph models for networks with community structure
    Fronczak, Piotr
    Fronczak, Agata
    Bujok, Maksymilian
    PHYSICAL REVIEW E, 2013, 88 (03)
  • [37] A novel network core structure extraction algorithm utilized variational autoencoder for community detection
    Fei, Rong
    Wan, Yuxin
    Hu, Bo
    Li, Aimin
    Li, Qian
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 222
  • [38] Detecting overlapping community structure: Estonian network of payments
    de la Torre, Stephanie Rendon
    Kalda, Jaan
    Kitt, Robert
    Engelbrecht, Juri
    PROCEEDINGS OF THE ESTONIAN ACADEMY OF SCIENCES, 2019, 68 (01) : 79 - 88
  • [39] The network model of urban subway networks with community structure
    Ding Yi-Min
    Ding Zhuo
    Yang Chang-Ping
    ACTA PHYSICA SINICA, 2013, 62 (09)
  • [40] Community structure extraction in directed network using triads
    Domgue, Felicite Gamgne
    Tsopze, Norbert
    Ndoundam, Rene
    INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2020, 49 (08) : 819 - 842