Context Driven Approach for Extracting Relevant Documents from WWW

Cited by: 0
Authors
Sarika [1]
Chaudhary, Meena [1]
Affiliation
[1] Manav Rachna Coll Engn, Faridabad, India
Source
2015 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION & AUTOMATION (ICCCA) | 2015
Keywords
World Wide Web; Latent Semantic Indexing; relevant pages; Singular Value Decomposition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The World Wide Web holds more than 3 billion HTML pages, and these pages are accessed only through search engines. A search engine is a program that searches documents for a specified set of keywords and returns a list of the documents in which any or all of those keywords were found. As more information becomes available on the web, it becomes more difficult to provide effective search services for internet users. It is assumed that users do not always formulate search queries using the best terms, which increases the number of irrelevant search results. Moreover, synonyms of the query terms are not searched for. Another problem is improper indexing of web documents, which hampers information retrieval because the query terms do not correspond to the words by which documents are indexed. Thus the indexing of web documents affects their relevancy as well as web latency. A promising approach to overcoming these problems is Latent Semantic Indexing. This indexing scheme uses Singular Value Decomposition (SVD) to find the underlying latent semantic structure and, as a result, the relevant pages.
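The abstract gives no implementation details, so the following is only a minimal sketch of the general Latent Semantic Indexing idea, not the authors' method. It assumes raw term counts, whitespace tokenization, a rank-2 truncation and cosine similarity; the function names and the toy documents are hypothetical. It builds a term-document matrix, truncates its SVD, folds a query into the reduced space and scores each document.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Return a (terms x documents) raw-count matrix and the term index."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index


def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of Vt[:k].T are the documents' coordinates in the latent space.
    return U[:, :k], s[:k], Vt[:k, :].T


def lsi_scores(query, index, U_k, s_k, doc_vecs):
    """Fold the query into the latent space and return cosine similarities."""
    q = np.zeros(U_k.shape[0])
    for w in query.lower().split():
        if w in index:  # terms outside the vocabulary are ignored
            q[index[w]] += 1.0
    q_hat = (q @ U_k) / s_k  # q^T U_k S_k^{-1}
    eps = 1e-12  # guards against division by zero for empty queries
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat) + eps
    return (doc_vecs @ q_hat) / norms


# Hypothetical toy corpus: three vehicle documents, two fruit documents.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "fruit apple banana",
    "banana apple pie",
]
A, index = build_term_document_matrix(docs)
U_k, s_k, doc_vecs = lsi_fit(A, k=2)
scores = lsi_scores("automobile", index, U_k, s_k, doc_vecs)
ranking = np.argsort(-scores)  # document indices, most similar first
```

In this toy corpus the first document never contains the word "automobile", yet it lands close to the query in the latent space because it shares "engine" and "repair" with a document that does. This is the synonym effect the abstract refers to; a real system would typically add term weighting and choose the rank empirically.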
Pages: 837-842
Page count: 6