Browser with Clustering of Web Documents

Cited by: 1

Authors:

Tetali, Ravitheja [1]

Bose, Joy [1]

Arif, Tasleem [1]

Affiliations:

[1] Samsung Res Inst India Bangalore, WMG Grp, Bangalore, Karnataka, India

Source:

2013 SECOND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND SECURITY (ADCONS 2013) | 2013

Keywords:

document clustering; Web browser; MajorClust; intelligent browsing; Web history

DOI:

10.1109/ADCONS.2013.20

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Accessing relevant information quickly, given limited time and space, is a major issue in Web browsers, especially those on mobile devices. In this paper we propose a framework for grouping Web documents in a browser based on the similarity of the content of the browsed pages. This grouping can help reduce clutter and enable the user to access relevant Web information quickly. The algorithm we used for clustering is MajorClust, a document similarity algorithm based on tokenizing the words in the document and then determining a cosine similarity measure to estimate the distance between the words. The entire clustering algorithm is implemented inside the browser, without the need for an external Web server. We have implemented and tested the algorithm on a mobile browser and obtained accurate, finer-grained clustering of Web pages when compared to Alexa's sub-categories.
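
The pipeline named in the abstract (tokenize, compute cosine similarity, cluster with MajorClust) can be illustrated with a minimal Python sketch. This is not the authors' in-browser implementation: the function names, the regular-expression tokenizer, the raw term-count vectors, the `threshold` value and the tie-breaking rule are all assumptions made for illustration, and cosine similarity is assumed to be computed between whole documents. The clustering loop is a simplified form of MajorClust in which each document repeatedly adopts the cluster label that attracts it with the largest total edge weight.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def majorclust(docs, threshold=0.2, max_iter=100):
    """Simplified MajorClust: returns one cluster label per document.

    `threshold` (an assumed value) drops weak edges from the similarity graph.
    """
    vectors = [Counter(tokenize(d)) for d in docs]
    n = len(docs)

    # Build a weighted, undirected similarity graph over the documents.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(vectors[i], vectors[j])
            if s >= threshold:
                weights[i][j] = weights[j][i] = s

    # Every document starts in its own cluster.
    labels = list(range(n))
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            # Sum the edge weights from document i to each neighbouring cluster.
            attraction = defaultdict(float)
            for j in range(n):
                if weights[i][j] > 0:
                    attraction[labels[j]] += weights[i][j]
            if not attraction:
                continue  # isolated document keeps its own cluster
            best = max(attraction.values())
            # Keep the current label if it is already among the strongest.
            if attraction.get(labels[i], 0.0) >= best:
                continue
            # Otherwise move to the strongest cluster (smallest label on ties).
            labels[i] = min(l for l, w in attraction.items() if w == best)
            changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    pages = [
        "python programming language tutorial",
        "learn python programming basics",
        "football match score today",
        "football league match results",
    ]
    print(majorclust(pages))
```

Documents that end up with the same label belong to the same group. The `max_iter` bound is there because this label-update scheme is not guaranteed to settle on every input.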

Pages: 164-168

Number of pages: 5
