A novel Fuzzy-PSO term weighting automatic query expansion approach using combined semantic filtering

Cited by: 24
Authors
Gupta, Yogesh [1]
Saini, Ashish [2]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, Uttar Pradesh, India
[2] Dayalbagh Educ Inst, Dept Elect Engn, Agra, Uttar Pradesh, India
Keywords
Automatic query expansion; Term weighting schemes; Co-occurrence score; Fuzzy logic; Particle swarm optimization; Term frequency; Inverse document frequency; INFORMATION-RETRIEVAL; JOINT REPLENISHMENT; ALGORITHM; MODEL; SCHEMES
DOI
10.1016/j.knosys.2017.09.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An Information Retrieval (IR) system retrieves relevant documents from large datasets. Automatic Query Expansion (AQE) is one approach to enhancing IR performance by adding terms to the original query. Selecting suitable additional terms for AQE is a crucial task, and term weighting is one way to address it. This paper presents a new term-weighting-based AQE approach to retrieve more relevant documents from a data corpus. The proposed approach comprises three major steps. The first step determines the optimal weights of different IR evidences for different terms using Particle Swarm Optimization (PSO). A fuzzy logic technique is used to improve the performance of PSO by controlling the inertia and acceleration coefficients during optimization. A co-occurrence score is introduced as a new IR evidence in the proposed approach. The second step removes noisy terms using a new combined semantic filtering method. The third step reweights the terms using the Rocchio method. The proposed approach is compared with recently developed AQE approaches on precision, recall, F-measure and Mean Average Precision (MAP). Three benchmark datasets, CACM, CISI and TREC-3, are used to verify the results. On these benchmark datasets the proposed approach performs better than the other approaches. © 2017 Elsevier B.V. All rights reserved.
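The third step named in the abstract, Rocchio reweighting, can be illustrated with a minimal sketch. This is the textbook positive-feedback form of Rocchio, not the paper's exact formulation: the function name `rocchio_reweight`, the `alpha` and `beta` values, and the toy term weights are all assumptions made for illustration.

```python
def rocchio_reweight(query_weights, feedback_docs, alpha=1.0, beta=0.75):
    """Textbook Rocchio update without a non-relevant term:
    new_q = alpha * q + beta * centroid(feedback_docs).

    query_weights: dict mapping term -> weight in the original query.
    feedback_docs: list of dicts mapping term -> weight, one per document
                   assumed relevant (hypothetical input format).
    """
    # Scale the original query vector by alpha.
    new_weights = {term: alpha * w for term, w in query_weights.items()}

    # Add beta times the centroid of the feedback document vectors.
    n = len(feedback_docs)
    for doc in feedback_docs:
        for term, w in doc.items():
            new_weights[term] = new_weights.get(term, 0.0) + beta * w / n
    return new_weights


# Toy data, invented for illustration only.
query = {"query": 1.0, "expansion": 1.0}
docs = [
    {"query": 0.5, "retrieval": 0.4},
    {"expansion": 0.2, "retrieval": 0.8},
]
weights = rocchio_reweight(query, docs)
```

In this sketch a term that appears only in the feedback documents ("retrieval") enters the expanded query with a nonzero weight, while the original query terms keep most of their weight. How the paper chooses its coefficients and feedback documents is not stated in the abstract.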
Pages: 97-120
Page count: 24