Matrices with low-rank-plus-shift structure: Partial SVD and latent semantic indexing

被引:15
作者
Zha, HY [1 ]
Zhang, ZY
机构
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[2] Zhejiang Univ, Dept Appl Math, Hangzhou 310027, Peoples R China
[3] Zhejiang Univ, Ctr Math Sci, Hangzhou 310027, Peoples R China
关键词
singular value decomposition; low-rank approximation; latent semantic indexing; perturbation analysis; large sparse matrix;
D O I
10.1137/S0895479898344443
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
We present a detailed analysis of matrices satisfying the so-called low-rank-plus-shift property in connection with the computation of their partial singular value decomposition (SVD). The application we have in mind is latent semantic indexing for information retrieval, where the term-document matrices generated from a text corpus approximately satisfy this property. The analysis is motivated by developing more efficient methods for computing and updating partial SVD of large term-document matrices and gaining deeper understanding of the behavior of the methods in the presence of noise.
引用
收藏
页码:522 / 536
页数:15
相关论文
共 17 条
[1]   LARGE-SCALE SPARSE SINGULAR VALUE COMPUTATIONS [J].
BERRY, MW .
INTERNATIONAL JOURNAL OF SUPERCOMPUTER APPLICATIONS AND HIGH PERFORMANCE COMPUTING, 1992, 6 (01) :13-49
[2]   Using linear algebra for intelligent information retrieval [J].
Berry, MW ;
Dumais, ST ;
OBrien, GW .
SIAM REVIEW, 1995, 37 (04) :573-595
[3]  
DEERWESTER S, 1990, J AM SOC INFORM SCI, V41, P391, DOI 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO
[4]  
2-9
[5]  
Golub GH, 1989, MATRIX COMPUTATIONS
[6]  
KOWALSKI G, 1997, INFORMATION RETRIEVA
[7]   LEXICAL AMBIGUITY AND INFORMATION-RETRIEVAL [J].
KROVETZ, R ;
CROFT, WB .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1992, 10 (02) :115-141
[8]  
PARLETT BN, 1980, SYMMETRIC EIGENVALUE
[9]  
Salton G., 1988, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer
[10]  
SIMON H, 1997, CSE97008