Strong similarity measures for ordered sets of documents in information retrieval

Cited by: 24
Authors
Egghe, L
Michel, C
Affiliations
[1] Limburgs Univ Ctr, B-3590 Diepenbeek, Belgium
[2] Univ Instelling Antwerp, B-2610 Antwerp, Belgium
[3] DU Bordeaux 3, MSHA, CEM GRESIC, F-33607 Pessac, France
DOI
10.1016/S0306-4573(01)00051-6
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A general method is presented to construct ordered similarity measures (OS-measures), i.e., similarity measures for ordered sets of documents (as, e.g., being the result of an IR-process), based on classical, well-known similarity measures for ordinary sets (measures such as Jaccard, Dice, Cosine or overlap measures). To this end, we first present a review of these measures and their relationships. The method given here to construct OS-measures extends the one given by Michel in a previous paper so that it becomes applicable to any pair of ordered sets. Concrete expressions of this method, applied to the classical similarity measures, are given. Some of these measures are then tested in the IR-system Profil-Doc. The engine SPIRIT extracts ranked document sets in three different contexts, each for 550 requests. The practical usability of the OS-measures is then discussed based on these experiments. © 2002 Elsevier Science Ltd. All rights reserved.
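For orientation, the classical set-based measures named in the abstract can be sketched as follows. This is a minimal illustration using their standard textbook definitions for unordered sets; it is not the OS-measure construction of the paper, which is not reproduced in this record. The function names, the sample document identifiers, and the convention of returning 0.0 for empty inputs are assumptions made for this sketch, and only one common (minimum-based) form of the overlap measure is shown.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard: |A ∩ B| / |A ∪ B|."""
    union = a | b
    # Assumption: similarity is defined as 0.0 when both sets are empty.
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """Dice: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine for sets: |A ∩ B| / sqrt(|A| * |B|)."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def overlap(a: set, b: set) -> float:
    """Overlap (minimum-based form): |A ∩ B| / min(|A|, |B|)."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical retrieval results, treated as unordered sets of document ids.
run_1 = {"d1", "d2", "d3"}
run_2 = {"d2", "d3", "d4", "d5"}

for measure in (jaccard, dice, cosine, overlap):
    print(f"{measure.__name__}: {measure(run_1, run_2):.4f}")
```

All four functions ignore the order in which documents were retrieved, which is the limitation the OS-measures described in the abstract are meant to address.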
Pages: 823 - 848
Number of pages: 26
Related papers
50 in total
  • [21] Clustering of documents via similarity measures
    Rezanková, H
    Húsek, D
    Smid, J
    Snásel, V
    CIC'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN COMPUTING, 2003, : 292 - 299
  • [22] A Comparison of Similarity Measures for Text Documents
    Hariharan, Shanmugasundaram
    Srinivasan, Rengaramanujam
    JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT, 2008, 7 (01) : 1 - 8
  • [23] From horn strong backdoor sets to ordered strong backdoor sets
    Paris, Lionel
    Ostrowski, Richard
    Siegel, Pierre
    Sais, Lakhdar
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827 : 105 - +
  • [24] INFORMATION RETRIEVAL FOR SHORT DOCUMENTS
    Qi, Haoliang
    Li, Mu
    Gao, Jianfeng
    Li, Sheng
    Ministry of Education - Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China; Microsoft Research Asia, Beijing, China; Microsoft Research, Redmond, WA, USA
    Journal of Electronics, 2006, (06) : 933 - 936
  • [25] ANNOTATIONS ON DOCUMENTS FOR INFORMATION RETRIEVAL
    Patil, Vishal A.
    Khambre, Pankaj
    2016 INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION (ICCUBEA), 2016,
  • [26] INFORMATION RETRIEVAL FOR SHORT DOCUMENTS
    Qi, Haoliang
    Li, Mu
    Gao, Jianfeng
    Li, Sheng
    Ministry of Education - Microsoft Key Laboratory of Natural Language Processing and Speech (Harbin Institute of Technology)
    Journal of Electronics(China), 2006, (06) : 933 - 936
  • [27] Evaluation and analysis of similarity measures for content-based visual information retrieval
    Horst Eidenberger
    Multimedia Systems, 2006, 12 : 71 - 87
  • [28] Evaluation and analysis of similarity measures for content-based visual information retrieval
    Eidenberger, Horst
    MULTIMEDIA SYSTEMS, 2006, 12 (02) : 71 - 87
  • [29] Tagged sets, convex sets and quantum similarity measures
    Carbo-Dorca, R
    JOURNAL OF MATHEMATICAL CHEMISTRY, 1998, 23 (3-4) : 353 - 364
  • [30] Tagged sets, convex sets and quantum similarity measures
    Ramon Carbó‐Dorca
    Journal of Mathematical Chemistry, 1998, 23 : 353 - 364