3.5K runs, 5K topics, 3M assessments and 70M measures: What trends in 10 years of Adhoc-ish CLEF?

Cited by: 7
Authors
Ferro, Nicola [1 ]
Silvello, Gianmaria [1 ]
Affiliations
[1] Univ Padua, Dept Informat Engn, Via Gradenigo 6-B, I-35131 Padua, Italy
Keywords
Information retrieval; Multilingual information access; Longitudinal analysis; Experimental evaluation; CLEF; Information access
DOI
10.1016/j.ipm.2016.08.001
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multilingual information access and retrieval is a key concern in today's global society and, despite the considerable achievements of the past years, it still presents many challenges. In this context, experimental evaluation represents a key driver of innovation, and multilinguality is tackled in several evaluation initiatives worldwide, such as CLEF in Europe, NTCIR in Japan and Asia, and FIRE in India. All these activities have run several evaluation cycles, and there is general consensus about their strong and positive impact on the development of multilingual information access systems. However, a systematic and quantitative assessment of the impact of evaluation initiatives on multilingual information access and retrieval over the long term is still missing. Therefore, in this paper we conduct the first systematic and large-scale longitudinal study on several CLEF Adhoc-ish tasks - namely the Adhoc, Robust, TEL, and GeoCLEF labs - in order to gain insights into the performance trends of monolingual, bilingual and multilingual information access systems, spanning several European and non-European languages, over a range of 10 years. We learned that monolingual retrieval exhibits a stable positive trend for many of the languages analyzed, even though the performance increase is not always steady from year to year due to the varying interests of the participants, who may not always be focused solely on increasing performance. Bilingual retrieval shows larger improvements in more recent years - probably due to the better language resources now available - and it also outperforms monolingual retrieval in several cases. Multilingual retrieval shows improvements over the years, and its performance is comparable to that of bilingual and monolingual retrieval, and sometimes even better. Moreover, we have found evidence that the rule-of-thumb of a three-year duration for an evaluation task is typically sufficient, since top performance is usually reached by the third year and sometimes even by the second, which then leaves room for research groups to investigate relevant research issues beyond raw performance. Overall, this study provides quantitative evidence that CLEF has achieved the objective which led to its establishment, i.e., making multilingual information access a reality for European languages. However, the outcomes of this paper not only indicate that CLEF has steered the community in the right direction, but also highlight the many open challenges for multilinguality. For instance, multilingual technologies greatly depend on language resources, and targeted evaluation cycles help not only in developing and improving them, but also in devising methodologies that are increasingly language-independent. Another key aspect concerns multimodality, understood not only as the capability of providing access to information in multiple media, but also as the ability to integrate access and retrieval over different media and languages in a way that best fits user needs and tasks. (C) 2016 Elsevier Ltd. All rights reserved.
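As an illustration of the kind of longitudinal analysis the abstract describes, the following minimal Python sketch computes the best and median Mean Average Precision per campaign year for a given task; it is not the authors' tooling, and the input file and its columns ("year", "task", "run", "map") are hypothetical assumptions made only for this example.

    # Hypothetical sketch of a longitudinal trend analysis over CLEF-style runs.
    # Assumed input: a CSV with columns "year", "task", "run", "map".
    import csv
    from collections import defaultdict
    from statistics import median

    def yearly_trends(path, task):
        """Return {year: (best_map, median_map)} for the given task."""
        scores = defaultdict(list)  # year -> list of MAP scores, one per run
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                if row["task"] == task:
                    scores[int(row["year"])].append(float(row["map"]))
        return {year: (max(vals), median(vals))
                for year, vals in sorted(scores.items())}

    if __name__ == "__main__":
        # e.g., check whether top performance plateaus by the third campaign year
        for year, (best, med) in yearly_trends("runs.csv", "Adhoc-Mono-DE").items():
            print(f"{year}: best MAP={best:.3f}, median MAP={med:.3f}")

A table of such per-year best and median scores is the simplest way to see whether a task's top performance keeps improving or levels off after a few cycles, which is the kind of evidence the study draws on.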
Pages: 175-202
Number of pages: 28