A hybrid semantic query expansion approach for Arabic information retrieval

Cited by: 15
Authors
ALMarwi, Hiba [1]
Ghurab, Mossa [1]
Al-Baltah, Ibrahim [2]
Affiliations
[1] Sanaa Univ, Comp Sci Dept, Sanaa, Yemen
[2] Sanaa Univ, Informat Technol Dept, Sanaa, Yemen
Keywords
Query expansion; Word embeddings; Particle swarm optimization; Information retrieval; WordNet; Term frequency
DOI
10.1186/s40537-020-00310-z
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Most information retrieval systems retrieve documents by keyword matching, and therefore fail to retrieve documents that have a similar meaning but use syntactically different keywords (forms). One well-known way to overcome this limitation is query expansion (QE). Query expansion can be approached in several ways. The statistical approach relies on term frequency to generate expansion features, but it does not consider meaning or term dependency. The semantic approach relies on a knowledge base, which has a limited number of terms and relations. In this paper, the researchers propose a hybrid approach to query expansion that combines the statistical and semantic approaches. To select the optimal terms for query expansion, they propose a weighting method based on particle swarm optimization (PSO). A system prototype was implemented as a proof of concept, and its accuracy was evaluated. The experiments were carried out on a real dataset. The experimental results confirm that the proposed approach improves the accuracy of query expansion.
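The abstract does not give the scoring formula, the PSO fitness function, or any parameter values, so the following is only a minimal sketch of the general idea: blend a statistical (term frequency) score with a semantic score for each candidate expansion term, and let a small particle swarm tune the blending weight. Everything here is an assumption made for illustration, including the names `hybrid_score`, `pso_maximize`, and `fitness`, the linear blend, the PSO coefficients, and the toy English candidate terms with their scores and relevance labels. It is not the authors' method.

```python
import random

# Hypothetical candidates: term -> (statistical score, semantic score, toy relevance label).
# All numbers are invented for illustration only.
CANDIDATES = {
    "car": (0.9, 0.6, 1.0),
    "vehicle": (0.3, 0.9, 1.0),
    "the": (0.8, 0.1, 0.0),
    "engine": (0.8, 0.4, 1.0),
}


def hybrid_score(tf_score, sem_score, alpha):
    """Linear blend of a statistical and a semantic score (an assumed form)."""
    return alpha * tf_score + (1.0 - alpha) * sem_score


def fitness(alpha):
    """Toy fitness: negative squared error between blended scores and labels."""
    return -sum(
        (hybrid_score(tf, sem, alpha) - label) ** 2
        for tf, sem, label in CANDIDATES.values()
    )


def pso_maximize(objective, n_particles=20, n_iters=50, seed=0,
                 inertia=0.7, c1=1.5, c2=1.5):
    """Minimal one-dimensional PSO over the interval [0, 1]."""
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_fit = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Pull each particle toward its own best and the swarm's best position.
            vel[i] = (inertia * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))  # clamp to [0, 1]
            fit = objective(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i], fit
    return gbest


best_alpha = pso_maximize(fitness)
# Rank candidate expansion terms by their blended score, best first.
ranked = sorted(
    CANDIDATES,
    key=lambda t: hybrid_score(CANDIDATES[t][0], CANDIDATES[t][1], best_alpha),
    reverse=True,
)
```

In this toy setup the frequent but uninformative term "the" ends up at the bottom of `ranked`, which illustrates why a purely frequency-based score may benefit from a semantic component. A realistic fitness function would more plausibly be a retrieval metric measured on held-out queries.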
Pages: 19