KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems

Cited by: 65

Authors:

Gao, Chongming [1]

Li, Shijun [1]

Lei, Wenqiang [2]

Chen, Jiawei [3]

Li, Biao [4]

Jiang, Peng [4]

He, Xiangnan [1]

Mao, Jiaxin [5]

Chua, Tat-Seng [6]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] Sichuan Univ, Chengdu, Sichuan, Peoples R China

[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China

[4] Kuaishou Technol Co Ltd, Beijing, Peoples R China

[5] Renmin Univ China, Beijing, Peoples R China

[6] Natl Univ Singapore, Singapore, Singapore

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

Fully-observed data; Recommendation; Evaluation; User simulation;

DOI:

10.1145/3511808.3557220

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The progress of recommender systems is hampered mainly by evaluation as it requires real-time interactions between humans and systems, which is too laborious and expensive. This issue is usually approached by utilizing the interaction history to conduct offline evaluation. However, existing datasets of user-item interactions are partially observed, leaving it unclear how and to what extent the missing interactions will influence the evaluation. To answer this question, we collect a fully-observed dataset from Kuaishou's online environment, where almost all 1,411 users have been exposed to all 3,327 items. To the best of our knowledge, this is the first real-world fully-observed data with millions of user-item interactions. With this unique dataset, we conduct a preliminary analysis of how the two factors - data density and exposure bias - affect the evaluation results of multi-round conversational recommendation. Our main discoveries are that the performance ranking of different methods varies with the two factors, and this effect can only be alleviated in certain cases by estimating missing interactions for user simulation. This demonstrates the necessity of the fully-observed dataset. We release the dataset and the pipeline implementation for evaluation at https://kuairec.com.

Pages: 540-550

Page count: 11
