CF-Rank: Learning to rank by classifier fusion on click-through data

Cited by: 3
Authors
Keyhanipour, Amir Hosein [1]
Moshiri, Behzad [1]
Rahgozar, Maseud [1]
Affiliations
[1] Univ Tehran, Sch ECE, Ctr Excellence, Control & Intelligent Proc, Tehran, Iran
Keywords
Learning to Rank; Click-through Data; Classifier Fusion; USERS PREFERENCES; AGGREGATION;
DOI
10.1016/j.eswa.2015.07.014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ranking, a key functionality of Web search engines, is a user-centric process. However, click-through data, the source of users' implicit feedback, are absent from almost all datasets published for the ranking task. This limitation also holds for the majority of benchmark datasets prepared for learning to rank, a new and promising trend in the information retrieval literature. In this paper, inspired by the concept of click-through data, the notion of click-through features is introduced. Click-through features can be derived from the given primitive dataset even when the benchmark dataset contains no click-through data. These features fall into three categories, related to users' queries, search results, or users' clicks. Using click-through features, a novel learning to rank algorithm is proposed. In its first step, the proposed algorithm takes into account informativeness measures such as MAP, NDCG, InformationGain and OneR to generate a classifier for each category of click-through features. Thereafter, these classifiers are fused using exponential ordered weighted averaging operators. Experimental results from extensive investigations on the WCL2R and LETOR4.0 benchmark datasets demonstrate that the proposed method can substantially outperform well-known ranking methods in the presence of explicit click-through data, based on MAP and NDCG criteria. The improvement is most noticeable at the top of ranked lists, which usually attract more of users' attention than other parts of these lists. On the WCL2R dataset, the improvement is about 20.25% for P@1 and 5.68% for P@3 in comparison with SVMRank, a well-known learning to rank algorithm. CF-Rank can also obtain performance higher than or comparable to baseline methods even in the absence of explicit click-through data in the primitive datasets used. In this regard, on the LETOR4.0 dataset the proposed method has achieved an improvement of about 2.7% in MAP compared to the AdaRank-NDCG algorithm. (C) 2015 Elsevier Ltd. All rights reserved.
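
The fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the operator's formula, so this assumes the "optimistic" exponential OWA weights as commonly defined in the OWA literature (w_1 = α, w_i = α(1-α)^(i-1), w_n = (1-α)^(n-1)). The function names, the value of `alpha`, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def optimistic_exponential_owa_weights(n: int, alpha: float) -> List[float]:
    """Optimistic exponential OWA weights (assumed form, not from the paper).

    w_i = alpha * (1 - alpha)**(i - 1) for i < n, and w_n = (1 - alpha)**(n - 1),
    so the weights always sum to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weights = [alpha * (1 - alpha) ** i for i in range(n - 1)]
    weights.append((1 - alpha) ** (n - 1))
    return weights


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def fuse_scores(scores_per_classifier: Sequence[Sequence[float]], alpha: float) -> List[float]:
    """Fuse per-document scores from several classifiers into one score per document.

    scores_per_classifier[c][d] is the score classifier c assigns to document d.
    """
    weights = optimistic_exponential_owa_weights(len(scores_per_classifier), alpha)
    return [owa(doc_scores, weights) for doc_scores in zip(*scores_per_classifier)]


# Hypothetical scores from three classifiers (query, result and click categories)
# for two documents.
query_scores = [0.2, 0.9]
result_scores = [0.8, 0.1]
click_scores = [0.4, 0.5]
fused = fuse_scores([query_scores, result_scores, click_scores], alpha=0.5)
```

Documents would then be ranked by their fused score. Because an OWA operator weights positions in the sorted score list rather than specific classifiers, a larger `alpha` in this sketch pushes the fused score toward the highest of the three classifier outputs.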
Pages: 8597-8608
Number of pages: 12