Word sense discrimination in information retrieval: A spectral clustering-based approach

Cited by: 20
Authors
Chifu, Adrian-Gabriel [1]
Hristea, Florentina [2]
Mothe, Josiane [3]
Popescu, Marius [2]
Affiliations
[1] Univ Toulouse 3, Univ Toulouse, CNRS, IRIT UMR5505, F-31062 Toulouse 9, France
[2] Univ Bucharest, Fac Math & Comp Sci, Dept Comp Sci, RO-010014 Bucharest, Romania
[3] Univ Toulouse, Ecole Super Professorat & Educ, CNRS, IRIT UMR5505, F-31062 Toulouse 9, France
Keywords
Information retrieval; Word sense disambiguation; Word sense discrimination; Spectral clustering; High precision
DOI
10.1016/j.ipm.2014.10.007
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been defined to help systems choose which documents should be retrieved in relation to an ambiguous query. However, the only approaches that show a genuine benefit for word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method we develop is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. For several TREC ad hoc collections we show that our method is useful in the case of queries which contain ambiguous terms. We are interested in improving the level of precision after 5, 10 and 30 retrieved documents (P@5, P@10, P@30) respectively. We show that precision can be improved by 8% above current state-of-the-art baselines. We also focus on poor performing queries. (C) 2014 Elsevier Ltd. All rights reserved.
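The P@5, P@10 and P@30 measures named in the abstract are instances of precision at a rank cutoff k: the fraction of the top k retrieved documents that are relevant. The following is a minimal sketch of that measure only, not the authors' evaluation code; the function name `precision_at_k` and the document identifiers are hypothetical and chosen for illustration.

```python
def precision_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Return the fraction of the top-k ranked documents that are relevant.

    ranked_doc_ids: document ids in ranked order, best first (hypothetical ids).
    relevant_doc_ids: ids judged relevant for the query.
    k: rank cutoff, e.g. 5, 10 or 30.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    relevant = set(relevant_doc_ids)
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Dividing by k (not by len(top_k)) means a list shorter than k
    # is penalised for the missing positions.
    return hits / k


# Illustrative data only: five ranked documents, two of them relevant.
ranking = ["d1", "d2", "d3", "d4", "d5"]
judged_relevant = {"d1", "d3", "d9"}
print(precision_at_k(ranking, judged_relevant, 5))
```

Under this definition, a re-ranking step that moves relevant documents into the top k raises P@k without changing which documents were retrieved overall.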
Pages: 16-31
Page count: 16
Related papers
50 in total
  • [41] Research on information retrieval system based on ant clustering algorithm
    Liu, Peiyu
    Zhu, Zhenfang
    Zhao, Lina
    Journal of Software, 2009, 4 (09) : 1032 - 1036
  • [42] A Conditional Mutual Information Based Selectional Association and Word Sense Disambiguation
    Guo, Xiao
    Li, Dayou
    Clapworthy, Gordon
    IEEE NLP-KE 2009: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2009, : 249 - 255
  • [43] A Genetic Algorithm Based Approach for Hindi Word Sense Disambiguation
    Athaiya, Anidhya
    Modi, Deepa
    Pareek, Gunjan
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND ELECTRONICS SYSTEMS (ICCES 2018), 2018, : 11 - 14
  • [44] A Supervised Approach for Word Sense Disambiguation based on Arabic Diacritics
    Alrakaf, Alaa Abdullah
    Rahman, Sk. Md. Mizanur
    2016 5TH INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS AND VISION (ICIEV), 2016, : 1015 - 1021
  • [45] An Information Retrieval Approach Based on Time Attributes
    Zheng, Xiaolin
    Zhou, Chaofei
    Shen, Yisu
    INFORMATION AND BUSINESS INTELLIGENCE, PT I, 2012, 267 : 78 - 85
  • [46] Category based customization approach for information retrieval
    Aihara, K
    Takasu, A
    USER MODELING 2001, PROCEEDINGS, 2001, 2109 : 207 - 209
  • [47] Term weighting for information retrieval based on term's discrimination power
    Li, Qing
    Lee, Seungwoo
    Jung, Hanmin
    Lee, Yeong Su
    Cho, Jae-Hyun
    Song, Sa-kwang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 71 (02) : 769 - 781
  • [48] Term weighting for information retrieval based on term’s discrimination power
    Qing Li
    Seungwoo Lee
    Hanmin Jung
    Yeong Su Lee
    Jae-Hyun Cho
    Sa-kwang Song
    Multimedia Tools and Applications, 2014, 71 : 769 - 781
  • [49] Web pages clustering and concepts mining: An approach towards intelligent information retrieval
    Li, Fang
    Mehlitz, Martin
    Fen, Li
    Sheng, Huanye
    2006 IEEE CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2006, : 522 - +
  • [50] Query Expansion based on Word Embeddings and Ontologies for Efficient Information Retrieval
    Rastogi, Namrata
    Verma, Parul
    Kumar, Pankaj
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (11) : 367 - 373