A review of cluster analysis techniques and their uses in library and information science research: k-means and k-medoids clustering

Cited by: 28

Authors:

Lund, Brady ^{[1]}

Ma, Jinxuan ^{[1]}

Affiliations:

[1] Emporia State Univ, Emporia, KS 66801 USA

Source:

PERFORMANCE MEASUREMENT AND METRICS | 2021 / Vol. 22 / Issue 03

Keywords:

Clustering; Library and information science; Research methods; Cluster analysis; Data analysis; K-means

DOI:

10.1108/PMM-05-2021-0026

Chinese Library Classification (CLC) number:

G25 [Library science, librarianship]; G35 [Information science, information work]

Discipline classification code:

1205 ; 120501 ;

Abstract:

Purpose - This literature review explores the definitions and characteristics of cluster analysis, a machine-learning technique frequently used to identify groupings in large datasets, and its applicability to library and information science (LIS) research. The overview is intended for researchers who want to expand their data analysis repertoire to include cluster analysis, rather than for existing experts in this area. Design/methodology/approach - The review covers LIS articles in the Library and Information Source (EBSCO) database that employ cluster analysis. It presents an overview of cluster analysis in general (how it works from a statistical standpoint and how researchers can perform it), the most popular cluster analysis techniques, and the uses of cluster analysis in LIS. Findings - The number of LIS studies that employ a cluster analytic approach has grown from about 5 per year in the early 2000s to an average of 35 per year in the mid- and late 2010s. The journal Scientometrics has published the most LIS articles that use cluster analysis (102 studies), and scientometrics is also the subject area that most commonly employs a cluster analytic approach (152 studies). The findings indicate that cluster analysis could make LIS research more accessible by providing an innovative and insightful process of knowledge discovery. Originality/value - This review is the first to present cluster analysis as an accessible data analysis approach specifically from an LIS perspective.

Pages: 161-173

Page count: 13
