A review of cluster analysis techniques and their uses in library and information science research: k-means and k-medoids clustering

Cited by: 28
Authors
Lund, Brady [1]
Ma, Jinxuan [1]
Affiliations
[1] Emporia State Univ, Emporia, KS 66801 USA
Keywords
Clustering; Library and information science; Research methods; Cluster analysis; Data analysis; K-means;
DOI
10.1108/PMM-05-2021-0026
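Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work];
Discipline classification code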
1205 ; 120501 ;
Abstract
Purpose - This literature review explores the definitions and characteristics of cluster analysis, a machine-learning technique frequently used to identify groupings in large datasets, and its applicability to library and information science (LIS) research. This overview is intended for researchers who are interested in expanding their data analysis repertoire to include cluster analysis, rather than for existing experts in this area.
Design/methodology/approach - A review is performed of LIS articles in the Library and Information Source (EBSCO) database that employ cluster analysis. The review presents an overview of cluster analysis in general (how it works from a statistical standpoint and how researchers can perform it), the most popular cluster analysis techniques, and the uses of cluster analysis in LIS.
Findings - The number of LIS studies that employ a cluster analytic approach has grown from about 5 per year in the early 2000s to an average of 35 studies per year in the mid- and late-2010s. The journal Scientometrics has published the most LIS articles that use cluster analysis (102 studies). Scientometrics is also the most common subject area to employ a cluster analytic approach (152 studies). The findings of this review indicate that cluster analysis could make LIS research more accessible by providing an innovative and insightful process of knowledge discovery.
Originality/value - This review is the first to present cluster analysis as an accessible data analysis approach, specifically from an LIS perspective.
Pages: 161 - 173
Number of pages: 13
Related papers
50 in total
  • [31] New Approaches to Normalization Techniques to Enhance K-Means Clustering Algorithm
    Dalatu, P. I.
    Midi, H.
    MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES, 2020, 14 (01) : 41 - 62
  • [32] Outlier Detection using Clustering Techniques - K-means and K-median
    Angelin, B.
    Geetha, A.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS 2020), 2020 : 373 - 378
  • [33] Probabilistic reduced K-means cluster analysis
    Lee, Seunghoon
    Song, Juwon
    KOREAN JOURNAL OF APPLIED STATISTICS, 2021, 34 (06) : 905 - 922
  • [34] Research on selecting initial points for k-means clustering
    Wang, Shou-Qiang
    Zhu, Da-Ming
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008 : 2673 - 2677
  • [35] Analysis and Visualization of Twitter Data using k-means Clustering
    Garg, Neha
    Rani, Rinkle
    2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2017 : 670 - 675
  • [36] A new method for selecting initial cluster centers in k-means clustering algorithm
    Zhang, Guoying
    Sha, Yun
    He, Yuanjiao
    2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 2, 2008 : 879 - 883
  • [37] K-means clustering for SAT-AIS data analysis
    Mieczyńska, Marta
    Czarnowski, Ireneusz
    WMU Journal of Maritime Affairs, 2021, 20 : 377 - 400
  • [38] Neighborhood density method for selecting initial cluster centers in k-means clustering
    Ye, Yunming
    Huang, Joshua Zhexue
    Chen, Xiaojun
    Zhou, Shuigeng
    Williams, Graham
    Xu, Xiaofei
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 189 - 198
  • [39] An Improved Swarm Based Hybrid K-Means Clustering for Optimal Cluster Centers
    Nayak, Janmenjoy
    Naik, Bighnaraj
    Kanungo, D. P.
    Behera, H. S.
    INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 545 - 553
  • [40] Social Media Analysis using Optimized K-Means Clustering
    Alsayat, Ahmed
    El-Sayed, Hoda
    2016 IEEE/ACIS 14TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATIONS (SERA), 2016 : 61 - 66