A General Linear Mixed Models Approach to Study System Component Effects

Cited by: 28
Authors
Ferro, Nicola [1]
Silvello, Gianmaria [1]
Affiliations
[1] Univ Padua, Dept Informat Engn, Padua, Italy
Source
SIGIR'16: PROCEEDINGS OF THE 39TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2016
Keywords
DOI
10.1145/2911451.2911530
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Topic variance affects performance more than system variance does, but system developers cannot control it and can only try to cope with it. System variance, on the other hand, is important in its own right: it is what developers can affect directly by changing system components, and it determines the differences among systems. In this paper, we address the problem of studying system variance in order to better understand how much system components contribute to overall performance. To this end, we propose a methodology based on the General Linear Mixed Model (GLMM) to develop statistical models that isolate system variance, component effects, and their interactions by relying on a Grid of Points (GoP) containing all the combinations of the analysed components. We apply the proposed methodology to TREC Ad-hoc data to show how it works and to discuss some interesting outcomes of this new kind of analysis. Finally, we extend the analysis to different evaluation measures, showing how they affect the sources of variance.
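To make the idea of a Grid of Points more concrete, the following is a minimal sketch, not the authors' implementation. It enumerates every combination of a few component levels, scores each resulting system on each topic, and reports the share of the total sum of squares attributable to each factor's main effect. The component names, their levels, the topics, and the scoring function are all hypothetical and synthetic. The decomposition is a plain fixed-effects sum-of-squares computation on a balanced design, used here as a simplified stand-in for the mixed model described in the abstract.

```python
from itertools import product
from statistics import mean

# Hypothetical component levels and topics; none are taken from the paper.
COMPONENTS = {
    "stop_list": ["none", "smart"],
    "stemmer": ["none", "porter"],
    "model": ["bm25", "tfidf"],
}
TOPICS = ["t1", "t2", "t3"]


def grid_of_points(components):
    """Enumerate every combination of component levels (one system each)."""
    names = list(components)
    return [dict(zip(names, levels)) for levels in product(*components.values())]


def synthetic_score(system, topic):
    """Made-up additive score standing in for a real evaluation measure."""
    score = {"t1": 0.2, "t2": 0.4, "t3": 0.6}[topic]
    if system["stop_list"] == "smart":
        score += 0.02
    if system["stemmer"] == "porter":
        score += 0.04
    if system["model"] == "bm25":
        score += 0.06
    return score


def main_effect_ss(rows, factor):
    """Sum of squares of one factor's main effect in a balanced design."""
    grand = mean(r["score"] for r in rows)
    ss = 0.0
    for level in {r[factor] for r in rows}:
        group = [r["score"] for r in rows if r[factor] == level]
        ss += len(group) * (mean(group) - grand) ** 2
    return ss


def decompose(components, topics):
    """Score the full grid on all topics and return per-factor shares of total SS."""
    rows = [
        {**system, "topic": topic, "score": synthetic_score(system, topic)}
        for system in grid_of_points(components)
        for topic in topics
    ]
    grand = mean(r["score"] for r in rows)
    total = sum((r["score"] - grand) ** 2 for r in rows)
    shares = {f: main_effect_ss(rows, f) / total for f in ["topic", *components]}
    return rows, shares


rows, shares = decompose(COMPONENTS, TOPICS)
for factor, share in shares.items():
    print(f"{factor}: {share:.3f}")
```

Because the synthetic scores are purely additive, the main effects account for all of the variation here. With real data, the remainder would be split between interaction terms and error, which is the part a mixed model is meant to handle.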
Pages: 25-34
Number of pages: 10