On updating problems in latent semantic indexing

Cited by: 87
Authors
Zha, HY [1]
Simon, HD [2]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, Pond Lab 307, University Pk, PA 16802 USA
[2] Univ Calif Berkeley, Lawrence Berkeley Lab, NERSC, Berkeley, CA 94720 USA
Keywords
singular value decomposition; updating problems; latent semantic indexing; information retrieval
DOI
10.1137/S1064827597329266
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We develop new SVD-updating algorithms for three types of updating problems arising from latent semantic indexing (LSI) for information retrieval to deal with rapidly changing text document collections. We also provide theoretical justification for using a reduced-dimension representation of the original document collection in the updating process. Numerical experiments using several standard text document collections show that the new algorithms give higher (interpolated) average precisions than the existing algorithms, and the retrieval accuracy is comparable to that obtained using the complete document collection.
Pages: 782-791
Page count: 10