Sapling Similarity: A performing and interpretable memory-based tool for recommendation

Cited by: 7
Authors
Albora, Giambattista [1, 2]
Mori, Lavinia Rossi [1, 3, 5]
Zaccaria, Andrea [1, 4]
Affiliations
[1] Enrico Fermi Res Ctr, Rome, Italy
[2] Univ Roma La Sapienza, Phys Dept, Rome, Italy
[3] Tor Vergata Univ, Phys Dept, Rome, Italy
[4] UOS Sapienza, Ist Sistemi Complessi CNR, Rome, Italy
[5] Sony Comp Sci Labs Rome, Joint Initiat CREF Sony, Rome, Italy
Keywords
Recommender system; Collaborative filtering; Bipartite networks; Similarity; Matrix factorization; Systems; Dynamics
DOI
10.1016/j.knosys.2023.110659
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many bipartite networks describe systems where an edge represents a relation between a user and an item. Measuring the similarity between either users or items is the basis of memory-based collaborative filtering, a widely used method for building a recommender system that proposes items to users. When the edges of the network are unweighted, the popular common-neighbors-based approaches allow only positive similarity values, and so neglect the possibility and the effect of two users (or two items) being very dissimilar. Moreover, they underperform model-based (machine learning) approaches, although they offer higher interpretability. Inspired by the functioning of Decision Trees, we propose a method to compute similarity that also allows negative values, the Sapling Similarity. The key idea is to look at how the information that a user is connected to an item influences our prior estimate of the probability that another user is connected to the same item: if the estimate is reduced, the similarity between the two users is negative; otherwise it is positive. We show that, when used to build memory-based collaborative filtering, Sapling Similarity provides better recommendations than existing similarity metrics. We then compare Sapling Similarity Collaborative Filtering (SSCF, a hybrid of the item-based and the user-based approaches) with state-of-the-art models on standard datasets. Although SSCF depends on only one straightforward hyperparameter, its recommendation accuracy is comparable or higher, and it outperforms all other models on the Amazon-Book dataset, while retaining the high explainability of memory-based approaches.
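
The key idea stated in the abstract can be illustrated with a small sketch. This is not the authors' published formula: the function name `signed_similarity`, the use of Gini impurity, and the normalization of the magnitude are all assumptions made for illustration. Only the sign rule (negative when knowing that user j is connected to an item lowers the estimated probability that user i is connected to it) follows the abstract directly.

```python
import numpy as np

def gini(p):
    """Gini impurity of a binary variable with success probability p."""
    return 2.0 * p * (1.0 - p)

def signed_similarity(M):
    """Illustrative signed user-user similarity on a binary user-item matrix.

    Assumption: the magnitude is the relative Gini impurity reduction of a
    one-split decision tree; the abstract only specifies the sign rule.
    """
    M = np.asarray(M, dtype=float)
    n_users, n_items = M.shape
    k = M.sum(axis=1)          # degree of each user (number of items)
    co = M @ M.T               # co-occurrences: items shared by two users
    sim = np.zeros((n_users, n_users))
    for i in range(n_users):
        for j in range(n_users):
            # Skip self-pairs and users with no items or all items,
            # for which the probabilities below are undefined.
            if i == j or k[i] in (0, n_items) or k[j] in (0, n_items):
                continue
            prior = k[i] / n_items                       # P(i has item)
            p_yes = co[i, j] / k[j]                      # P(i has item | j has it)
            p_no = (k[i] - co[i, j]) / (n_items - k[j])  # P(i has item | j lacks it)
            w_yes = k[j] / n_items                       # share of items j has
            after = w_yes * gini(p_yes) + (1.0 - w_yes) * gini(p_no)
            gain = 1.0 - after / gini(prior)             # value in [0, 1]
            sign = 1.0 if p_yes >= prior else -1.0       # sign rule from the abstract
            sim[i, j] = sign * gain
    return sim

# Toy data: users 0 and 1 share all items, user 2 has the complementary items.
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1]]
toy_sim = signed_similarity(toy)
```

In the toy matrix, users 0 and 1 receive a positive similarity and users 0 and 2 a negative one, which is the behaviour that common-neighbors-based measures cannot express.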
Pages: 11