Bi-Gram Term Collocations-based Query Expansion Approach for Improving Arabic Information Retrieval

Cited by: 0
Authors
Ibrahim Moawad
Waseem Alromima
Rania Elgohary
Affiliations
[1] Ain Shams University, Faculty of Computer and Information Sciences
[2] Taibah University, Department of Computer Science and Information
Keywords
Arabic information retrieval; Term collocations; Query expansion; Semantic information retrieval; Holy Quran
DOI
Not available
Abstract
In an era of information overload, information retrieval systems are vital applications, and many researchers have proposed new methods to improve search results. Unlike English, some languages such as Arabic have complex morphology and lack both linguistic and semantic resources. This paper proposes a language-independent, semantic-based information retrieval approach that expands the user query using bi-gram term collocations. The approach makes two main contributions. First, the bi-gram term collocations used to expand the user query are mined automatically from the text corpus, so no external semantic resource is needed. Second, because of the complexity of the language's morphology, the system index is built from the corpus words themselves, which saves the cost and effort of stemming. A system prototype for Arabic was implemented and evaluated against the stem-based method. The experimental evaluation was conducted on the Arabic text of the Holy Quran. The results show that the proposed system outperforms the stem-based method in terms of precision and recall.
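The general idea described in the abstract (mining bi-gram collocations from the corpus itself, indexing unstemmed surface words, and expanding the query with collocates) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how collocations are scored or selected, so the raw adjacent-pair frequency threshold `min_count`, the whitespace tokenization, the toy English corpus, and every function name below are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def mine_bigram_collocations(documents, min_count=2):
    """Collect adjacent word pairs that occur at least min_count times.

    Assumption: raw pair frequency stands in for whatever collocation
    measure the paper actually uses.
    """
    pair_counts = Counter()
    for text in documents:
        tokens = text.split()  # surface words, no stemming
        pair_counts.update(zip(tokens, tokens[1:]))

    collocates = defaultdict(set)
    for (left, right), count in pair_counts.items():
        if count >= min_count:
            collocates[left].add(right)
            collocates[right].add(left)
    return collocates


def expand_query(query, collocates):
    """Append the mined collocates of each query term to the query."""
    terms = query.split()
    expanded = list(terms)
    for term in terms:
        for partner in sorted(collocates.get(term, ())):
            if partner not in expanded:
                expanded.append(partner)
    return expanded


def build_index(documents):
    """Build an inverted index over the corpus words as they appear."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for token in text.split():
            index[token].add(doc_id)
    return index


def search(query, index, collocates):
    """Return ids of documents matching any term of the expanded query."""
    hits = set()
    for term in expand_query(query, collocates):
        hits |= index.get(term, set())
    return sorted(hits)


# Toy corpus (hypothetical data, English tokens for readability).
documents = [
    "holy book verse",
    "holy book chapter",
    "book of stories",
    "green tree",
]
collocates = mine_bigram_collocations(documents)
index = build_index(documents)
print(expand_query("holy", collocates))
print(search("holy", index, collocates))
```

In this toy corpus the pair "holy book" occurs twice, so a query for "holy" is expanded with "book" and also retrieves the third document, which an unexpanded word match would miss. Because nothing in the sketch depends on a stemmer or an external lexicon, the same code runs unchanged on Arabic text that has been split into words.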
Pages: 7705-7718
Page count: 13