A Correlation Based Recommendation System for Large Data Sets

Cited by: 0
Authors
Divya Pandove
Avleen Malhi
Affiliations
[1] Glover Park Group
[2] Aalto University
[3] Bournemouth University
Source
Journal of Grid Computing | 2021 / Vol. 19
Keywords
Correlation clustering; Recommendation system model; RBACC; LGBACC
DOI
Not available
Abstract
Correlation analysis reveals relationships in data that were not previously visible, and harnessing correlations effectively is essential for data mining. In this paper, we use correlation to cluster data and merge this clustering with recommendation algorithms. We propose two correlation clustering algorithms, RBACC and LGBACC, that produce high-quality hierarchical clusters: RBACC is based on Spearman's rank correlation coefficient among data points, and LGBACC uses a dimensionality reduction approach (PCA) together with graph theory. Both algorithms were tested on real-life data (the New York yellow cabs dataset, taken from http://www.nyc.gov) using distributed and parallel computing (Spark and R). They are found to be scalable and to perform better than existing hierarchical clustering algorithms. The two approaches are then used to replace the similarity measures in recommendation algorithms, yielding a correlation clustering based recommendation system model that combines correlation analysis with prediction analysis. This model makes better-quality recommendations than the random recommendation model. It has been validated on a real-time, large data set (the MovieLens dataset, taken from http://grouplens.org/datasets/movielens/latest). The results show that combining correlated points with the predictive power of recommendation algorithms produces better-quality recommendations that are faster to compute. LGBACC has approximately 25% better prediction capability than RBACC, but it takes significantly more prediction time.
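The idea of replacing a similarity measure with a rank correlation can be illustrated with a minimal sketch. This is not the authors' RBACC or LGBACC algorithm, which the abstract does not specify in detail. It only shows Spearman's rank correlation used as the similarity in a plain neighbour-based rating predictor. The function names (`ranks`, `spearman`, `predict`), the toy ratings, and the choice to keep only positively correlated neighbours are all assumptions made for illustration.

```python
from math import sqrt


def ranks(values):
    """Return 1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def predict(ratings, user, item):
    """Predict a rating as the correlation-weighted mean of neighbours' ratings.

    Assumption: only neighbours with positive Spearman correlation over
    at least two co-rated items contribute. Returns None if none qualify.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        common = [i for i in ratings[user] if i in other_ratings]
        if len(common) < 2:
            continue
        rho = spearman([ratings[user][i] for i in common],
                       [other_ratings[i] for i in common])
        if rho > 0:
            num += rho * other_ratings[item]
            den += rho
    return num / den if den else None


# Hypothetical toy ratings (user -> item -> rating), not taken from MovieLens.
toy_ratings = {
    "ann": {"a": 5, "b": 3, "c": 1},
    "bob": {"a": 4, "b": 2, "c": 1, "d": 5},
    "cat": {"a": 1, "b": 3, "c": 5, "d": 1},
}
prediction = predict(toy_ratings, "ann", "d")
```

In this toy data, `bob` ranks the shared items in the same order as `ann`, while `cat` ranks them in the opposite order, so only `bob` contributes to the prediction for item `d`. Because the measure works on ranks, two users who order items identically count as fully similar even when their rating scales differ.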
Related papers
50 in total
  • [1] A Correlation Based Recommendation System for Large Data Sets
    Pandove, Divya
    Malhi, Avleen
    JOURNAL OF GRID COMPUTING, 2021, 19 (04)
  • [2] Recommendation system based on the clustering of frequent sets
    Toma, Andrei
    Constantinescu, Radu
    Nastase, Floarea
    PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING AND DATA BASES, 2009, : 343 - +
  • [3] Multivariate Correlation Entropy and Law Discovery in Large Data Sets
    Wang, Jianji
    Zheng, Nanning
    Chen, Badong
    Chen, Pei
    Chen, Shitao
    Liu, Ziyi
    Wang, Fei-Yue
    Xi, Bao
    IEEE INTELLIGENT SYSTEMS, 2018, 33 (05) : 47 - 54
  • [4] Decentralized Canonical Correlation Analysis for Large Data-sets
    Arad, Asaf
    Peleg, Shahaf Yaron
    Amar, Alon
    2021 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2021, : 32 - 39
  • [5] AN ADAPTIVE HESITANT FUZZY SETS BASED GROUP RECOMMENDATION SYSTEM
    Jayaraman, Rohith
    Subramaniyaswamy, V.
    Ravi, Logesh
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2020, : 124 - 139
  • [6] Validating models based on large data sets
    Clark, RD
    Sprous, DG
    Leonard, JM
    RATIONAL APPROACHES TO DRUG DESIGN, 2001, : 475 - 485
  • [7] POL - INTERACTIVE SYSTEM TO ANALYZE LARGE DATA SETS
    ARMENISE, N
    ZITO, G
    SILVESTRI, A
    LEFONS, E
    PAZIENZA, MT
    TANGORRA, F
    COMPUTER PHYSICS COMMUNICATIONS, 1979, 16 (02) : 147 - 157
  • [8] Content Recommendation System Based on Global Data
    Lin, Zhiyong
    2015 INTERNATIONAL CONFERENCE ON EDUCATION RESEARCH AND REFORM (ERR 2015), PT 1, 2015, 8 : 488 - 492
  • [9] Research on Recommendation System Based on Massive Data
    He, Fengqin
    2018 4TH INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT AND INFORMATION TECHNOLOGY (ICEMIT 2018), 2018, : 411 - 414
  • [10] Zynq-based System for Extracting Sorted Subsets from Large Data Sets
    Sklyarov, V.
    Skliarova, I.
    Rjabov, A.
    Sudnitson, A.
    INFORMACIJE MIDEM-JOURNAL OF MICROELECTRONICS ELECTRONIC COMPONENTS AND MATERIALS, 2015, 45 (02) : 142 - 152