Applicability of K-medoids and K-means algorithms for segmenting students based on their scholastic performance

Cited by: 5
Authors
Badhera, Usha [1]
Verma, Apoorva [2]
Nahar, Pooja [2]
Affiliations
[1] Jaipuria Inst Management, Dept Business Analyt, Jaipur 302033, Rajasthan, India
[2] SS Jain Subodh PG Coll, Dept Comp Sci, Jaipur 302005, Rajasthan, India
Keywords
Educational data mining; Clustering; K-means; K-medoids; Academic performance; Outliers; Clustering algorithms
DOI
10.1080/09720510.2022.2130566
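Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract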
This paper surveys the literature to identify the clustering techniques that researchers have recently used to predict academic performance. The survey shows that the K-means algorithm is particularly popular among researchers because of its simplicity and scalability, while other studies selected the K-medoids algorithm because it is less affected by outliers. Based on these observations, the two clustering algorithms were implemented in Python on a dataset of undergraduate students from a higher education institute. Two different clusters were obtained, which segment students based on their academic performance in the previous two exams. The clusters obtained have a high accuracy score. The K-medoids cluster centroids take exact values of marks obtained by students, whereas each K-means centroid value is a rounded-off figure. K-means clustering is also affected by the presence of outliers in the student dataset.
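The contrast described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the marks are hypothetical (they are not the paper's dataset), the function names are invented, and the K-medoids variant is a simple alternating update rather than the authors' implementation or the classic PAM procedure. It shows only that a medoid is always an actual student's marks, while a K-means centroid is an average that an outlier can pull away from the group.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centers):
    """Group each point with its nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def mean_point(group):
    """K-means update: the coordinate-wise average of the group."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))


def medoid_point(group):
    """K-medoids update: the member with the smallest total distance to the rest."""
    return min(group, key=lambda c: sum(dist(c, p) for p in group))


def cluster(points, init_centers, update, max_iter=100):
    """Alternate assignment and center update until the centers stop changing."""
    centers = list(init_centers)
    for _ in range(max_iter):
        groups = assign(points, centers)
        new_centers = [update(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Hypothetical (exam 1, exam 2) marks; (0, 0) plays the role of an outlier,
# for example a student who was absent from both exams.
marks = [(35, 40), (38, 42), (40, 37), (42, 45),
         (78, 80), (82, 85), (85, 79), (88, 90),
         (0, 0)]
start = [(35, 40), (78, 80)]  # fixed starting centers keep the run deterministic

kmeans_centers = cluster(marks, start, mean_point)
kmedoids_centers = cluster(marks, start, medoid_point)

print("K-means centers:  ", kmeans_centers)
print("K-medoids centers:", kmedoids_centers)
```

In this toy data the outlier drags the low-scoring K-means centroid below every real student in that group, while the K-medoids center remains one of the observed mark pairs.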
Pages: 1621-1632
Page count: 12