Unsupervised Qualitative Scoring for Binary Item Features

Cited by: 0
Authors
Koji Ichikawa
Hiroshi Tamano
Affiliation
[1] NEC Corporation
Source
Data Science and Engineering | 2020 / Vol. 5
Keywords
Label enhancement; Unsupervised learning; Collaborative filtering; Topic model
Abstract
Binary features, such as categories, keywords, or tags, are widely used to describe product properties. However, these features are incomplete in that they omit several aspects of numerical information. A qualitative score for each tag is widely used to indicate which product is better with respect to a given property. For example, on a restaurant navigation site, properties such as mood, dishes, and location are given as numerical values representing the goodness of each aspect. In this paper, we propose a novel approach to estimating the qualitative score from the binary features of products. Based on the natural assumption that an item with a better property is more popular among users who prefer that property (in short, "experts know best"), we introduce both discriminative and generative models that infer user preferences and item qualitative scores from user-item interactions. We constrain the space of item qualitative scores by the item binary features, so that the score for an item and tag can be nonzero only when the item has that tag. This approach helps resolve three difficulties: (1) the absence of supervised data for score estimation, (2) implicit user purpose, and (3) contamination by irrelevant tags. We evaluate our models on two artificial datasets and two real-world datasets of movie and book ratings. With the artificial datasets, we evaluate performance under sparse-transaction and noisy-tag settings. With the real-world movie rating dataset, we evaluate how well the models resolve irrelevant tags and observe that they outperform a baseline model. Finally, we compare tag rankings obtained from the real-world datasets with those of a baseline model.
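The masking constraint described in the abstract (a score may be nonzero only where the item carries the tag) can be illustrated with a minimal sketch. This is not the authors' discriminative or generative model: it swaps in a generic nonnegative matrix factorization under squared loss, fitted by projected gradient descent. The function name `fit_masked_scores`, the toy matrices `X` and `B`, and all hyperparameters are assumptions made for illustration only.

```python
import numpy as np


def fit_masked_scores(X, B, n_iters=500, lr=0.01, seed=0):
    """Illustrative masked factorization (an assumption, not the paper's model).

    X: (n_users, n_items) user-item interaction matrix.
    B: (n_items, n_tags) binary item-tag matrix.
    Returns P (user-tag preferences) and S (item-tag scores),
    where S is zero wherever B is zero.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = X.shape
    n_tags = B.shape[1]
    P = rng.uniform(0.1, 1.0, size=(n_users, n_tags))
    S = rng.uniform(0.1, 1.0, size=(n_items, n_tags)) * B  # apply tag mask
    for _ in range(n_iters):
        R = P @ S.T - X                # residual, shape (n_users, n_items)
        grad_P = R @ S                 # gradient of 0.5 * ||R||^2 w.r.t. P
        grad_S = (R.T @ P) * B         # gradient w.r.t. S, masked by B
        # Projected gradient step: keep both factors nonnegative,
        # and keep S at zero wherever the item lacks the tag.
        P = np.maximum(P - lr * grad_P, 0.0)
        S = np.maximum(S - lr * grad_S, 0.0) * B
    return P, S


# Toy data (hypothetical): 4 users, 4 items, 2 tags.
X = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)
B = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 1]])
P, S = fit_masked_scores(X, B)
```

In this sketch, each column of `S` can be read as a per-tag ranking restricted to the items that carry that tag, while `P` plays the role of user preference for each tag.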
Pages: 317-330
Number of pages: 13