A Comparison of Spectral Clustering and the Walktrap Algorithm for Community Detection in Network Psychometrics

被引:15
作者
Brusco, Michael [1 ]
Steinley, Douglas [2 ]
Watts, Ashley L. [2 ]
机构
[1] Florida State Univ, Dept Business Analyt Informat Syst & Supply Chain, Tallahassee, FL 32306 USA
[2] Univ Missouri, Dept Psychol Sci, Columbia, MO 65211 USA
基金
美国国家卫生研究院;
关键词
network psychometrics; spectral clustering; walktrap algorithm; K-means clustering; K-median clustering; ISING-MODEL SELECTION; LOCAL OPTIMA; BINARY DATA; NUMBER; COMORBIDITY; DEPRESSION;
D O I
10.1037/met0000509
中图分类号
B84 [心理学];
学科分类号
04 ; 0402 ;
摘要
Spectral clustering is a well-known method for clustering the vertices of an undirected network. Although its use in network psychometrics has been limited, spectral clustering has a close relationship to the commonly used walktrap algorithm. In this article, we report results from simulation experiments designed to evaluate the ability of spectral clustering and the walktrap algorithm to recover underlying cluster (or community) structure in networks. The salient findings include: (a) the recovery performance of the walktrap algorithm can be improved by using K-means clustering instead of hierarchical clustering; (b) K-means and K-median clustering led to comparable recovery performance when used to cluster vertices based on the eigenvectors of Laplacian matrices in spectral clustering; (c) spectral clustering using the unnormalized Laplacian matrix generally yielded inferior cluster recovery in comparison to the other methods; (d) when the correct number of clusters was provided for the methods, spectral clustering using the normalized Laplacian matrix led to better recovery than the walktrap algorithm; and (e) when the correct number of clusters was not provided, the walktrap algorithm using the Q(w) modularity index was better than spectral clustering using the eigengap heuristic at determining the appropriate number of clusters. Overall, both the walktrap algorithm and spectral clustering of the normalized Laplacian matrix are effective for partitioning the vertices of undirected networks, with the latter performing better in most instances. Translational Abstract The walktrap algorithm has often been used in network psychometrics to identify communities (clusters) of variables. We show that the walktrap algorithm can often be improved by using multistart K-means to group the variables instead of Ward's hierarchical method. We also propose spectral clustering as an alternative approach to community detection and compare it to the walktrap algorithm. Spectral clustering provided better recovery of known underlying cluster structure when the correct number of clusters is provided for the methods. However, when the correct number of clusters was unknown, the walktrap algorithm using modularity to choose the number of clusters outperformed spectral clustering using the eigengap heuristic to choose the number of clusters. Therefore, it seems advantageous to consider using the walktrap algorithm to choose the number of clusters and spectral clustering to actually obtain the communities of items for that selected number of clusters.
引用
收藏
页码:704 / 722
页数:19
相关论文
共 84 条
[1]  
[Anonymous], 2020, MATLAB
[2]  
[Anonymous], 1991, GRAPH THEORY COMBINA
[3]   High-dimensional Ising model selection with Bayesian information criteria [J].
Barber, Rina Foygel ;
Drton, Mathias .
ELECTRONIC JOURNAL OF STATISTICS, 2015, 9 (01) :567-607
[4]   The role of stabilizing and communicating symptoms given overlapping communities in psychopathology networks [J].
Blanken, Tessa F. ;
Deserno, Marie K. ;
Dalege, Jonas ;
Borsboom, Denny ;
Blanken, Peter ;
Kerkhof, Gerard A. ;
Cramer, Angelique O. J. .
SCIENTIFIC REPORTS, 2018, 8
[5]  
Borg I., 2018, Applied multidimensional scaling and unfolding, V2nd, DOI [10.1007/978-3-319-73471-2, DOI 10.1007/978-3-319-73471-2]
[6]   Network analysis of empathy items from the interpersonal reactivity index in 1973 young adults [J].
Briganti, Giovanni ;
Kempenaers, Chantal ;
Braun, Stephanie ;
Fried, Eiko I. ;
Linkowski, Paul .
PSYCHIATRY RESEARCH, 2018, 265 :87-92
[7]   Optimal partitioning of a data set based on the p-median model [J].
Brusco, Michael J. ;
Koehn, Hans-Friedrich .
PSYCHOMETRIKA, 2008, 73 (01) :89-105
[8]   A repetitive branch-and-bound procedure for minimum within-cluster sums of squares partitioning [J].
Brusco, Michael J. .
PSYCHOMETRIKA, 2006, 71 (02) :347-363
[9]   Disentangling relationships in symptom networks using matrix permutation methods [J].
Brusco, Michael J. ;
Steinley, Douglas ;
Watts, Ashley L. .
PSYCHOMETRIKA, 2022, 87 (01) :133-155
[10]   K-medoids inverse regression [J].
Brusco, Michael J. ;
Steinley, Douglas ;
Stevens, Jordan .
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2019, 48 (20) :4999-5011