Search Engine Optimization Using Unsupervised Learning

Cited by: 0

Authors:

Joglekar, Bela [1]

Bhatia, Rohan [1]

Jayaprakash, Soumya [1]

Raina, Karan [1]

Mulchandani, Saniya [1]

Affiliations:

[1] MIT, Dept IT, Pune, Maharashtra, India

Source:

2019 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, CONTROL AND AUTOMATION (ICCUBEA) | 2019

Keywords:

Search Engine Optimization; Search Query; Clustering; Ranking

DOI:

Not available

Chinese Library Classification number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

The web has become the most in-demand tool for retrieving information from a large repository. As the amount of information on the World Wide Web grows, it becomes increasingly difficult to find exactly what we want. Existing search engines mostly rank content on many factors beyond its quality, including sponsored links, advertisements, and paid endorsements. Our project aims to develop a tool that ranks search results solely on the basis of their content, not on which article the user is most likely to click, which avoids the problem of clickbait. The goal is a tool that can scan the web on a specific topic and create a synthesis of the content found. We gather search results from various search engines using a web crawler and process them with our custom ranking algorithm, which clusters the results and ranks them by content quality. After crawling the web and retrieving the information, we use term frequency-inverse document frequency (TF-IDF) as the weighting scheme, followed by singular value decomposition (SVD) to decompose the weighted matrix. Finally, we use spherical K-means and a custom ranking algorithm to display rich content. To give more efficient results, our project presents a new algorithm that ranks web pages according to their relevance to the user's query.
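
The abstract names the stages of the pipeline but gives no implementation details, so the following is only a minimal sketch of two of them: TF-IDF weighting and spherical K-means (K-means on unit-length vectors with cosine similarity). It uses only the Python standard library and, to stay short, skips the SVD step entirely. The smoothed IDF formula, the function names, and the ranking rule in `rank_results` (documents in the cluster closest to the query first, then by cosine similarity to the query) are assumptions made for illustration; they are not the authors' custom ranking algorithm.

```python
import math
import random
from collections import Counter


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def tfidf_vectors(docs):
    """Build unit-length TF-IDF vectors for a list of tokenized documents."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF: an assumed variant, since the abstract does not specify one.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append(normalize([tf[t] * idf[t] for t in vocab]))
    return vocab, idf, vectors


def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster unit vectors by cosine similarity; centroids stay unit length."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]

    def assign():
        return [max(range(k), key=lambda c: dot(v, centroids[c])) for v in vectors]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = normalize([sum(col) for col in zip(*members)])
    return assign(), centroids


def rank_results(query_tokens, docs, k=2):
    """Hypothetical ranking rule: best-matching cluster first, then similarity."""
    vocab, idf, vectors = tfidf_vectors(docs)
    labels, centroids = spherical_kmeans(vectors, k)
    q_tf = Counter(query_tokens)
    q = normalize([q_tf[t] * idf[t] for t in vocab])
    best = max(range(k), key=lambda c: dot(q, centroids[c]))
    scored = [(dot(q, v), i, labels[i]) for i, v in enumerate(vectors)]
    # Each entry is (similarity to query, document index, cluster label).
    return sorted(scored, key=lambda s: (s[2] != best, -s[0]))


if __name__ == "__main__":
    pages = [
        "spherical k means clusters text documents".split(),
        "tf idf weights terms in text documents".split(),
        "cheap flights and hotel deals".split(),
        "hotel deals for cheap holiday flights".split(),
    ]
    for score, index, cluster in rank_results("cheap hotel".split(), pages):
        print(index, cluster, round(score, 3))
```

Because every vector is normalized, the dot product is the cosine similarity, which is what distinguishes spherical K-means from the Euclidean variant. In a fuller version, a truncated SVD of the TF-IDF matrix would sit between `tfidf_vectors` and `spherical_kmeans`, with the reduced vectors re-normalized before clustering.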

Pages: 5