Subset ranking using regression

Cited by: 116
Authors:
Cossock, David [1]
Zhang, Tong [2]
Affiliations:
[1] Yahoo Inc, Santa Clara, CA USA
[2] Yahoo Inc, New York, NY USA
Source:
LEARNING THEORY, PROCEEDINGS | 2006, Vol. 4005
DOI: 10.1007/11776420_44
CLC classification: TP18 (artificial intelligence theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
We study the subset ranking problem, motivated by its important application in web search. In this context, we consider the standard DCG criterion (discounted cumulated gain), which measures the quality of items near the top of the rank-list. As with error minimization for binary classification, the DCG criterion leads to a non-convex optimization problem that can be NP-hard. Therefore a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top portion of the rank-list. We further investigate the generalization ability of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
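The abstract's central quantity can be illustrated concretely. Below is a minimal sketch, assuming the standard DCG definition DCG@k = Σ_{i=1}^{k} rel_i / log2(i + 1) (the paper itself uses a general position-discount c_i; the logarithmic discount shown here is the common instantiation). The toy grades and predicted scores are illustrative, not from the paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions:
    DCG@k = sum_{i=1}^{k} rel_i / log2(i + 1)."""
    return sum(rel / math.log2(i + 1)
               for i, rel in enumerate(relevances[:k], start=1))

def rank_by_regression(scores):
    """The ranking induced by a regression estimate: order items
    by predicted relevance score, descending."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Toy example: true relevance grades and regression-predicted scores.
true_rel = [3, 2, 3, 0, 1]
pred = [2.8, 1.9, 2.5, 0.2, 1.1]

order = rank_by_regression(pred)          # [0, 2, 1, 4, 3]
top_k_grades = [true_rel[i] for i in order]
print(dcg_at_k(top_k_grades, k=3))
```

This makes the paper's reduction tangible: if the regression estimate is accurate on the items that end up near the top, the induced ranking's DCG is close to optimal, which is exactly the kind of guarantee the stated bounds formalize.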
Pages: 605-619 (15 pages)