Preference-Based Offline Evaluation

Cited by: 7
Authors
Clarke, Charles L. A. [1]
Diaz, Fernando [2]
Arabzadeh, Negar [1]
Affiliations
[1] Univ Waterloo, Waterloo, ON, Canada
[2] Google, Montreal, PQ, Canada
Source
PROCEEDINGS OF THE SIXTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2023, VOL 1 | 2023
Keywords
offline evaluation; preferences; search; tutorial;
DOI
10.1145/3539597.3572725
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A core step in production model research and development involves the offline evaluation of a system before production deployment. Traditional offline evaluation of search, recommender, and other systems involves gathering item relevance labels from human editors. These labels can then be used to assess system performance using offline evaluation metrics. Unfortunately, this approach does not work when evaluating highly-effective ranking systems, such as those emerging from the advances in machine learning. Recent work demonstrates that moving away from pointwise item and metric evaluation can be a more effective approach to the offline evaluation of systems. This tutorial, intended for both researchers and practitioners, reviews early work in preference-based evaluation and covers recent developments in detail.
Pages: 1248-1251
Page count: 4