Evolving local and global weighting schemes in information retrieval

Cited by: 29
Authors
Cummins, Ronan [1]
O'Riordan, Colm [1]
Affiliations
[1] Natl Univ Ireland, Dept Informat Technol, Galway, Ireland
Source
INFORMATION RETRIEVAL | 2006 / Vol. 9 / Issue 03
Keywords
genetic programming; information retrieval; term-weighting schemes
DOI
10.1007/s10791-006-1682-6
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper describes a method, using Genetic Programming, to automatically determine term-weighting schemes for the vector space model. Based on a set of queries and their human-determined relevant documents, weighting schemes are evolved which achieve a high average precision. In Information Retrieval (IR) systems, useful information for term-weighting schemes is available from the query, individual documents and the collection as a whole. We evolve term-weighting schemes in both local (within-document) and global (collection-wide) domains which interact with each other correctly to achieve a high average precision. These weighting schemes are tested on well-known test collections and are compared to the traditional tf-idf weighting scheme and to the BM25 weighting scheme using standard IR performance metrics. Furthermore, we show that the global weighting schemes evolved on small collections also increase average precision on larger TREC data. These global weighting schemes are shown to adhere to Luhn's resolving power, as both high- and low-frequency terms are assigned low weights. However, the local weightings evolved on small collections do not perform as well on large collections. We conclude that in order to evolve improved local (within-document) weighting schemes it is necessary to evolve these on large collections.
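To make the local/global split in the abstract concrete, here is a minimal Python sketch of the traditional tf-idf baseline, written as a local (within-document) weight multiplied by a global (collection-wide) weight, together with the average precision metric used to judge a ranking. It is not the evolved schemes from the paper: the function names (`local_tf`, `global_idf`, `score`, `average_precision`), the toy collection, and the choice of raw term frequency and `log(N / df)` are all assumptions made for illustration.

```python
import math
from collections import Counter

def local_tf(term, doc_tokens):
    """Local (within-document) weight: raw term frequency (assumed form)."""
    return Counter(doc_tokens)[term]

def global_idf(term, docs):
    """Global (collection-wide) weight: log(N / df) (assumed idf variant)."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def score(query_tokens, doc_tokens, docs):
    """Document score: sum over query terms of local weight * global weight."""
    return sum(local_tf(t, doc_tokens) * global_idf(t, docs) for t in query_tokens)

def average_precision(ranked_ids, relevant_ids):
    """Mean of precision values at the ranks where relevant documents occur."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0

# Hypothetical toy collection: each document is a list of tokens.
docs = [
    ["genetic", "programming", "evolves", "weights"],
    ["term", "weights", "for", "retrieval"],
    ["cooking", "recipes"],
]
query = ["genetic", "weights"]

# Rank document indices by descending score.
ranking = sorted(range(len(docs)),
                 key=lambda i: score(query, docs[i], docs),
                 reverse=True)
print(ranking, average_precision(ranking, {0}))
```

In a genetic-programming setup of the kind the abstract describes, the bodies of `local_tf` and `global_idf` would be the parts being evolved, and average precision over a set of training queries would serve as the fitness value.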
Pages: 311-330
Page count: 20
Related papers
50 in total
  • [1] Evolving local and global weighting schemes in information retrieval
    Ronan Cummins
    Colm O’Riordan
    Information Retrieval, 2006, 9 : 311 - 330
  • [2] Evolving general term-weighting schemes for information retrieval: Tests on larger collections
    Cummins, R
    O'Riordan, C
    ARTIFICIAL INTELLIGENCE REVIEW, 2005, 24 (3-4) : 277 - 299
  • [3] Evolving General Term-Weighting Schemes for Information Retrieval: Tests on Larger Collections
    Ronan Cummins
    Colm O’Riordan
    Artificial Intelligence Review, 2005, 24 : 277 - 299
  • [4] A study of Information Retrieval weighting schemes for sentiment analysis
    Paltoglou, Georgios
    Thelwall, Mike
    ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010 : 1386 - 1395
  • [5] Improving Information Retrieval Through a Global Term Weighting Scheme
    Cuellar, Daniel
    Diaz, Elva
    Ponce-de-Leon-Senti, Eunice
    PATTERN RECOGNITION (MCPR 2015), 2015, 9116 : 246 - 257
  • [6] Evolved term-weighting schemes in Information Retrieval: an analysis of the solution space
    Cummins, Ronan
    O'Riordan, Colm
    ARTIFICIAL INTELLIGENCE REVIEW, 2006, 26 (1-2) : 35 - 47
  • [7] Evolved term-weighting schemes in Information Retrieval: an analysis of the solution space
    Ronan Cummins
    Colm O’Riordan
    Artificial Intelligence Review, 2006, 26 : 35 - 47
  • [8] Evolving weighting schemes for the Bag of Visual Words
    Jair Escalante, Hugo
    Ponce-Lopez, Victor
    Escalera, Sergio
    Baro, Xavier
    Morales-Reyes, Alicia
    Martinez-Carranza, Jose
    NEURAL COMPUTING & APPLICATIONS, 2017, 28 (05) : 925 - 939
  • [9] Evolving weighting schemes for the Bag of Visual Words
    Hugo Jair Escalante
    Víctor Ponce-López
    Sergio Escalera
    Xavier Baró
    Alicia Morales-Reyes
    José Martínez-Carranza
    Neural Computing and Applications, 2017, 28 : 925 - 939
  • [10] An axiomatic comparison of learned term-weighting schemes in information retrieval: clarifications and extensions
    Cummins, Ronan
    O'Riordan, Colm
    ARTIFICIAL INTELLIGENCE REVIEW, 2007, 28 (01) : 51 - 68