On the analysis and evaluation of information retrieval models for social book search

Cited by: 4
Authors
Ullah, Irfan [1]
Khusro, Shah [2]
Affiliations
[1] Shaheed Benazir Bhutto Univ, Dept Comp Sci, Sheringal 18050, Pakistan
[2] Univ Peshawar, Dept Comp Sci, Peshawar 25120, Pakistan
Keywords
Information retrieval; Retrieval models; Book retrieval; Social book search; Social metadata; Probabilistic model; Divergence; Expansion
DOI
10.1007/s11042-022-13417-7
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Social Book Search (SBS) studies how the Social Web impacts book retrieval. This impact is studied in two steps. In the first step, called the baseline run, a search index holding bibliographic descriptions (professional metadata) and user-generated content (social metadata) is searched against the queries, and the results are ranked using a retrieval model. In the second step, called re-ranking, the baseline search results are re-ordered using social metadata to see whether search relevance improves. However, such an improvement can only be justified if the baseline run is first made as strong as possible by considering the contribution of the query, the index, and the retrieval model. Although existing studies have thoroughly explored the role of query formulation and document representation, only a few have considered the contribution of the retrieval model, and those experimented with only a few retrieval models. This article fills this gap in the literature. It identifies the best retrieval model in the SBS context by experimenting with twenty-five retrieval models using the Terrier IR platform on the Amazon/LibraryThing dataset, which comprises topic sets, relevance judgments, and a book corpus of 2.8 million records. The findings suggest that these retrieval models behave differently as the query and document representations change. DirichletLM and InL2 are the best-performing retrieval models for the majority of the retrieval runs. The previous best-performing SBS studies would have produced better results had they tested multiple retrieval models when selecting baseline runs. The findings confirm that the retrieval model plays a vital role in developing stronger baseline runs.
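The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not use Terrier; it assumes a toy in-memory corpus, scores the baseline run with the standard Dirichlet-smoothed query-likelihood formula (the idea behind a DirichletLM-style model), and re-ranks with a hypothetical social signal (a review count per book). The function names, the `weight` parameter, the small `mu` value, and the way the social signal is combined with the baseline score are all assumptions made for illustration.

```python
import math
from collections import Counter


def dirichlet_lm_score(query_terms, doc_terms, collection_tf, collection_len, mu):
    """Log query likelihood under a Dirichlet-smoothed document language model."""
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection: it cannot discriminate
        score += math.log((doc_tf[term] + mu * p_coll) / (doc_len + mu))
    return score


def baseline_run(query, docs, mu=10.0):
    """Step 1: rank every document for the query with the retrieval model."""
    collection_tf = Counter(t for terms in docs.values() for t in terms)
    collection_len = sum(collection_tf.values())
    query_terms = query.lower().split()
    scored = [
        (doc_id, dirichlet_lm_score(query_terms, terms, collection_tf, collection_len, mu))
        for doc_id, terms in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def rerank(baseline, review_counts, weight=0.2):
    """Step 2 (hypothetical): boost baseline scores by a log-scaled social signal."""
    adjusted = [
        (doc_id, score + weight * math.log1p(review_counts.get(doc_id, 0)))
        for doc_id, score in baseline
    ]
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Toy corpus: tokenised "professional metadata" for three made-up books.
toy_docs = {
    "b1": "introduction to information retrieval models".split(),
    "b2": "cooking with herbs and spices".split(),
    "b3": "retrieval of rare books a history".split(),
}
# Made-up social metadata: number of user reviews per book.
toy_reviews = {"b1": 0, "b2": 0, "b3": 500}

baseline = baseline_run("retrieval models", toy_docs)
reranked = rerank(baseline, toy_reviews)
print([doc_id for doc_id, _ in baseline])
print([doc_id for doc_id, _ in reranked])
```

In this toy setting the heavily reviewed book overtakes the baseline leader after re-ranking. Whether such a change counts as a real improvement depends on how strong the baseline was to begin with, which is the point the abstract makes about comparing retrieval models before re-ranking.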
Pages: 6431-6478
Number of pages: 48