Word sense discrimination in information retrieval: A spectral clustering-based approach

Cited by: 20
Authors
Chifu, Adrian-Gabriel [1]
Hristea, Florentina [2 ]
Mothe, Josiane [3 ]
Popescu, Marius [2 ]
Affiliations
[1] Univ Toulouse 3, Univ Toulouse, CNRS, IRIT UMR5505, F-31062 Toulouse 9, France
[2] Univ Bucharest, Fac Math & Comp Sci, Dept Comp Sci, RO-010014 Bucharest, Romania
[3] Univ Toulouse, Ecole Super Professorat & Educ, CNRS, IRIT UMR5505, F-31062 Toulouse 9, France
Keywords
Information retrieval; Word sense disambiguation; Word sense discrimination; Spectral clustering; High precision
DOI
10.1016/j.ipm.2014.10.007
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been defined to help systems choose which documents should be retrieved in relation to an ambiguous query. However, the only approaches that show a genuine benefit for word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method we develop is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. For several TREC ad hoc collections we show that our method is useful in the case of queries which contain ambiguous terms. We are interested in improving the level of precision after 5, 10 and 30 retrieved documents (P@5, P@10 and P@30, respectively). We show that precision can be improved by 8% above current state-of-the-art baselines. We also focus on poorly performing queries. © 2014 Elsevier Ltd. All rights reserved.
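The abstract evaluates re-ranking with precision at a cutoff (P@k) and describes boosting documents in an initially retrieved list. The following is a minimal Python sketch of those two ideas only, not the authors' implementation. The function names, the multiplicative boost, the `factor` value and the toy scores are all hypothetical; the abstract does not give the actual boosting formula, and the set of documents to boost is assumed here to come from a clustering step that is not shown.

```python
from typing import Dict, List, Set


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def boost_rerank(scores: Dict[str, float], boosted: Set[str],
                 factor: float = 2.0) -> List[str]:
    """Reorder documents after multiplying the scores of boosted ones.

    Assumption: a multiplicative boost stands in for the paper's actual
    re-ranking rule, which the abstract does not specify.
    """
    adjusted = {doc: score * (factor if doc in boosted else 1.0)
                for doc, score in scores.items()}
    return sorted(adjusted, key=lambda doc: adjusted[doc], reverse=True)


# Toy data (hypothetical): initial retrieval scores and relevance judgments.
initial_scores = {"d1": 4.0, "d2": 3.0, "d3": 2.5, "d4": 1.0}
relevant_docs = {"d3", "d4"}

# Assumed output of a sense-discrimination step: documents judged close
# to the query's intended sense. Here only d3 is selected for boosting.
query_sense_cluster = {"d3"}

initial_ranking = sorted(initial_scores, key=lambda d: initial_scores[d],
                         reverse=True)
reranked = boost_rerank(initial_scores, query_sense_cluster)

p2_before = precision_at_k(initial_ranking, relevant_docs, 2)
p2_after = precision_at_k(reranked, relevant_docs, 2)
```

In this toy case the boosted document moves to the top of the list, so P@2 rises from 0.0 to 0.5. The numbers illustrate the metric only and say nothing about the results reported in the paper.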
Pages: 16-31
Number of pages: 16