Improving recommendation quality through outlier removal

Cited by: 3
Authors
Xu, Yuan-Yuan [1]
Gu, Shen-Ming [2]
Min, Fan [1]
Affiliations
[1] Southwest Petr Univ, Sch Comp Sci, Chengdu 610500, Peoples R China
[2] Zhejiang Ocean Univ, Key Lab Oceanog Big Data Min & Applicat Zhejiang, Zhoushan 316022, Peoples R China
Keywords
Matrix factorization; Outlier removal; Recommender systems; SYSTEMS; FALLIBILITY; REFLEXIVITY; MODELS; NOISY; IMAGE
DOI
10.1007/s13042-021-01490-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Rating data collected by recommendation systems contain noise caused by human uncertainty and malicious attacks. Existing outlier removal approaches usually aim to detect noise inserted into ground-truth ratings. However, in real applications, the ground truth of the training data is unavailable, or even unimportant for the prediction task. In this paper, we propose an efficient and effective outlier removal algorithm to improve the quality of the training data. The noise is modeled by a mixture of Gaussian distributions, which can approximate any continuous distribution. First, we employ the expectation-maximization algorithm to calculate the low-rank matrices, whose product forms the recovered ratings. Second, we compare the original and recovered ratings to identify suspected outliers. This process is repeated a number of times, and ratings that are suspected enough times are treated as outliers. To validate the effectiveness of our algorithm, we compared the prediction quality of four popular recommendation algorithms. Results showed that several measures on the algorithms were improved with the new training data.
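The repeat-and-vote procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: plain regularized alternating least squares (squared loss) stands in for the EM fit under a mixture-of-Gaussians noise model, and every name and parameter here (`als_factorize`, `vote_outliers`, `resid_thresh`, `min_votes`, and so on) is hypothetical and chosen only for illustration.

```python
import numpy as np

def als_factorize(R, mask, rank, reg, n_iters, rng):
    """Fit R ~ U @ V.T on observed entries with regularized ALS.

    Assumption: squared loss replaces the mixture-of-Gaussians likelihood.
    """
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Update each user factor from the items that user rated.
        for i in range(n_users):
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[i, obs])
        # Update each item factor from the users who rated that item.
        for j in range(n_items):
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[obs, j])
    return U, V

def vote_outliers(R, mask, rank=2, reg=0.1, n_iters=20, n_rounds=5,
                  resid_thresh=1.5, min_votes=3, seed=0):
    """Flag observed ratings whose residual is large in enough rounds."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(R.shape, dtype=int)
    for _ in range(n_rounds):
        # Each round starts from a fresh random initialization.
        U, V = als_factorize(R, mask, rank, reg, n_iters, rng)
        recovered = U @ V.T
        # Suspect observed ratings that are far from the recovered rating.
        suspected = (np.abs(R - recovered) > resid_thresh) & mask
        votes += suspected
    return votes >= min_votes
```

Here `R` is a user-by-item rating matrix and `mask` is a boolean matrix marking which ratings were observed. The returned boolean matrix marks the ratings to drop before training a recommender. The threshold and vote count are arbitrary defaults for the sketch, not values taken from the paper.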
Pages: 1819-1832
Number of pages: 14
Related papers
59 in total
  • [1] Adeniyi DA., 2016, APPL COMPUT INFORM, V12, P90, DOI 10.1016/j.aci.2014.10.001
  • [2] Aggarwal C. C., 2016, RECOMMENDER SYSTEMS, DOI 10.1007/978-3-319-29659-3
  • [3] [Anonymous], 2018, INT J MACH LEARN CYB
  • [4] Reflexivity, complexity, and the nature of social science
    Beinhocker, Eric
    [J]. JOURNAL OF ECONOMIC METHODOLOGY, 2013, 20 (04) : 330 - 342
  • [5] Low-rank Matrix Factorization under General Mixture Noise Distributions
    Cao, Xiangyong
    Chen, Yang
    Zhao, Qian
    Meng, Deyu
    Wang, Yao
    Wang, Dong
    Xu, Zongben
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015 : 1493 - 1501
  • [6] Attack Detection in Recommender Systems Using Subspace Outlier Detection Algorithm
    Chakraborty, Partha Sarathi
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMMUNICATION, DEVICES AND COMPUTING, 2020, 602 : 679 - 685
  • [7] Supervised kernel nonnegative matrix factorization for face recognition
    Chen, Wen-Sheng
    Zhao, Yang
    Pan, Binbin
    Chen, Bo
    [J]. NEUROCOMPUTING, 2016, 205 : 165 - 181
  • [8] Robust Tensor Factorization with Unknown Noise
    Chen, Xiai
    Han, Zhi
    Wang, Yao
    Zhao, Qian
    Meng, Deyu
    Tang, Yandong
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016 : 5213 - 5221
  • [9] Davoudi A, 2017, IEEE INT CONF BIG DA, P2714, DOI 10.1109/BigData.2017.8258235
  • [10] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM
    DEMPSTER, AP
    LAIRD, NM
    RUBIN, DB
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) : 1 - 38