Latent Semantic Analysis (LSA) is widely used to find documents that are semantically similar to a keyword query. Although LSA yields promising results, existing LSA algorithms perform many unnecessary operations in similarity computation and candidate checking during on-line query processing. This is costly in time and prevents efficient responses to query requests, especially as the dataset grows. In this paper, we study the efficiency of on-line query processing for LSA, with the goal of efficiently finding the documents similar to a given query. We rewrite the LSA similarity equation in terms of an intermediate value, called the partial similarity, which is stored in a purpose-built index called the partial index. To reduce the search space, we give an approximate form of the similarity equation and develop an efficient algorithm for building the partial index that skips partial similarities lower than a given threshold. Based on the partial index, we develop an efficient algorithm called ILSA to support fast on-line query processing. The given query is transformed into a pseudo-document vector, and the similarities between the query and the candidate documents are computed by accumulating the partial similarities obtained from the index nodes that correspond to non-zero entries in the pseudo-document vector. Compared with the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning unpromising candidate documents and skipping operations that contribute little to the similarity scores. Extensive experiments comparing ILSA with LSA demonstrate the efficiency and effectiveness of the proposed algorithm.
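The accumulation step described above can be illustrated with a minimal sketch. The abstract does not give the similarity equation, so everything here is an assumption made for illustration rather than the ILSA algorithm itself. We assume each term already has a k-dimensional latent vector (for example, a row of the standard LSA fold-in matrix) and each document has a k-dimensional latent vector. The partial similarity of a term and a document is taken to be their dot product divided by the document norm, and entries whose magnitude falls below the threshold are skipped. The names `build_partial_index` and `query` are hypothetical.

```python
import math
from collections import defaultdict


def build_partial_index(term_vecs, doc_vecs, threshold=0.0):
    """Map each term id to a list of (doc id, partial similarity) pairs.

    Assumption: partial similarity = dot(term vector, doc vector) / ||doc vector||.
    Entries with magnitude below `threshold` are not stored.
    """
    doc_norms = [math.sqrt(sum(x * x for x in d)) for d in doc_vecs]
    index = {}
    for term_id, t in enumerate(term_vecs):
        postings = []
        for doc_id, d in enumerate(doc_vecs):
            if doc_norms[doc_id] == 0.0:
                continue  # an all-zero document cannot be scored
            p = sum(a * b for a, b in zip(t, d)) / doc_norms[doc_id]
            if abs(p) < threshold:
                continue  # skip contributions considered negligible
            postings.append((doc_id, p))
        index[term_id] = postings
    return index


def query(index, query_weights, top_k=3):
    """Score documents by accumulating weighted partial similarities.

    `query_weights` maps term id -> weight and holds only the non-zero
    entries of the query vector, so only those index nodes are visited.
    The score omits the query norm, which is constant across documents
    and therefore does not change the ranking.
    """
    scores = defaultdict(float)
    for term_id, weight in query_weights.items():
        for doc_id, p in index.get(term_id, []):
            scores[doc_id] += weight * p
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

With a threshold of zero, the accumulated score in this sketch equals the cosine similarity multiplied by the query norm, so the ranking matches the exact one. A higher threshold shrinks the index and the candidate set, and the scores become approximate.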