CF-Rank: Learning to rank by classifier fusion on click-through data

Cited by: 3
Authors
Keyhanipour, Amir Hosein [1]
Moshiri, Behzad [1]
Rahgozar, Maseud [1]
Affiliations
[1] Univ Tehran, Sch ECE, Ctr Excellence, Control & Intelligent Proc, Tehran, Iran
Keywords
Learning to Rank; Click-through Data; Classifier Fusion; Users' Preferences; Aggregation
DOI
10.1016/j.eswa.2015.07.014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ranking, a key functionality of Web search engines, is a user-centric process. However, click-through data, the source of users' implicit feedback, are missing from almost all datasets published for the ranking task. This limitation also holds for the majority of benchmark datasets prepared for learning to rank, a new and promising trend in the information retrieval literature. In this paper, inspired by the concept of click-through data, the notion of click-through features is introduced. Click-through features can be derived from the given primitive dataset even when the benchmark dataset contains no click-through data. These features fall into three categories, related to users' queries, search results, or users' clicks. Using click-through features, a novel learning to rank algorithm is proposed. In its first step, the proposed algorithm generates a classifier for each category of click-through features, taking into account informativeness measures such as MAP, NDCG, InformationGain and OneR. These classifiers are then fused using exponential ordered weighted averaging operators. Experimental results from extensive investigations on the WCL2R and LETOR4.0 benchmark datasets demonstrate that, in the presence of explicit click-through data, the proposed method can substantially outperform well-known ranking methods on the MAP and NDCG criteria. The improvement is most noticeable at the top of ranked lists, which usually attract more of users' attention than other parts of these lists. On the WCL2R dataset, the improvement is about 20.25% for P@1 and 5.68% for P@3 in comparison with SVMRank, a well-known learning to rank algorithm. CF-Rank can also obtain higher or comparable performance relative to baseline methods even in the absence of explicit click-through data in the primitive datasets used. In this regard, the proposed method achieves an improvement of about 2.7% on the MAP measure over the AdaRank-NDCG algorithm on the LETOR4.0 dataset. (C) 2015 Elsevier Ltd. All rights reserved.
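The abstract states that the per-category classifiers are fused with exponential ordered weighted averaging (OWA) operators but gives no formula. The following is a minimal sketch of that fusion step, assuming the commonly cited "optimistic" exponential OWA weights (w_1 = alpha, w_i = alpha(1 - alpha)^(i-1), w_n = (1 - alpha)^(n-1)). The function names, the parameter alpha, and the per-document score layout are illustrative assumptions, not the paper's implementation.

```python
from typing import Dict, List


def exponential_owa_weights(n: int, alpha: float) -> List[float]:
    """Optimistic exponential OWA weights (an assumed variant).

    w_1 = alpha, w_i = alpha * (1 - alpha)**(i - 1) for 1 < i < n,
    and w_n = (1 - alpha)**(n - 1), so the weights sum to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weights = [alpha * (1.0 - alpha) ** i for i in range(n - 1)]
    weights.append((1.0 - alpha) ** (n - 1))
    return weights


def owa_fuse(scores: List[float], weights: List[float]) -> float:
    """Apply an OWA operator: sort scores in descending order, then
    take the weighted sum, so weights attach to ranks, not to sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def fuse_and_rank(doc_scores: Dict[str, List[float]], alpha: float) -> List[str]:
    """Fuse each document's classifier scores and rank documents by the
    fused score, highest first. `doc_scores` maps a (hypothetical)
    document id to one score per click-through feature category."""
    fused = {}
    for doc_id, scores in doc_scores.items():
        weights = exponential_owa_weights(len(scores), alpha)
        fused[doc_id] = owa_fuse(scores, weights)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores from three classifiers (query, result, click features).
example = {"d1": [0.2, 0.9, 0.4], "d2": [0.5, 0.5, 0.5]}
ranking = fuse_and_rank(example, alpha=0.5)
```

Because an OWA operator weights scores by their sorted position, a larger alpha pushes the fused score toward the most confident classifier, while a smaller alpha moves it toward the least confident one.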
Pages: 8597-8608
Page count: 12