A two-stage information retrieval system based on interactive multimodal genetic algorithm for query weight optimization

Cited by: 6
Authors
Cong, Hao [1]
Chen, Wei-Neng [1, 2]
Yu, Wei-Jie [3]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Pazhou Lab, Guangzhou 510330, Peoples R China
[3] Sun Yat Sen Univ, Sch Informat Management, Guangzhou 510006, Peoples R China
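Funding
National Natural Science Foundation of China
Keywords
Query weight; Interactive; Multimodal; Genetic algorithm; Evolutionary computation; Relevance; Capacity
DOI
10.1007/s40747-021-00450-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract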
Query weight optimization, which aims to find the combination of query term weights that best ranks relevant documents, is an important topic in information retrieval. Because the search space is huge, the problem is intractable, and evolutionary algorithms have become a popular approach. As the database grows, however, traditional retrieval approaches may return a very large number of results, which leads to low efficiency and poor practicality. To solve this problem, this paper proposes a two-stage information retrieval system based on an interactive multimodal genetic algorithm (IMGA) for query weight optimization. The proposed IMGA has two stages: quantity control and quality optimization. In the quantity control stage, a multimodal genetic algorithm aided by a niching method simultaneously selects multiple promising combinations of query terms, which keeps the number of retrieved documents within an appropriate range. In the quality optimization stage, an interactive genetic algorithm searches for the optimal query weights so that the most user-friendly document ranking is produced. The genetic algorithm (GA) works interactively with a relevance feedback mechanism, and the users' feedback accelerates the optimization process. In place of user evaluation, a mathematical model is built to evaluate the fitness values of individuals. The proposed two-stage method thus not only controls the number of returned results but also improves the quality and accuracy of retrieval. The method is evaluated on a database of more than 2000 documents. The experimental results show that it outperforms several state-of-the-art query weight optimization approaches in terms of precision and recall.
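To make the idea of query weight optimization concrete, the following is a minimal sketch and not the paper's IMGA: it uses a plain single-population GA with no niching stage, and a fixed set of relevance labels stands in for interactive user feedback. The toy term-frequency matrix `DOCS`, the label set `RELEVANT`, the function names, and the choice of average precision as the fitness function are all assumptions made for illustration.

```python
import random

# Hypothetical toy data: rows are documents, columns are term frequencies
# for three query terms. RELEVANT simulates user relevance feedback.
DOCS = [[3, 0, 1], [0, 4, 0], [1, 1, 2], [0, 3, 1], [2, 0, 0]]
RELEVANT = {0, 4}

def rank(docs, weights):
    """Return document indices sorted by weighted term-frequency score."""
    scores = [sum(w * tf for w, tf in zip(weights, d)) for d in docs]
    return sorted(range(len(docs)), key=lambda i: -scores[i])

def average_precision(ranking, relevant):
    """Average precision of a ranking against a set of relevant doc ids."""
    hits, total = 0, 0.0
    for pos, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def fitness(weights):
    """Assumed fitness: ranking quality under the simulated feedback."""
    return average_precision(rank(DOCS, weights), RELEVANT)

def optimize_weights(n_terms, pop_size=20, generations=30, seed=0):
    """Plain elitist GA over weight vectors in [0, 1]; needs n_terms >= 2."""
    rng = random.Random(seed)
    # Seed the population with uniform weights as a baseline individual.
    pop = [[1.0] * n_terms]
    pop += [[rng.random() for _ in range(n_terms)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_terms)     # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_terms)          # Gaussian mutation, clamped
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = optimize_weights(n_terms=3)
print(best, fitness(best))
```

Because the best half of each generation is carried over unchanged, the returned weights never score worse than the uniform-weight baseline placed in the initial population. In an interactive setting, the fixed `RELEVANT` set would be replaced by judgments collected from the user or by a model of them.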
Pages: 2765-2781
Page count: 17