Browser with Clustering of Web Documents

Cited by: 1
Authors
Tetali, Ravitheja [1]
Bose, Joy [1]
Arif, Tasleem [1]
Affiliation
[1] Samsung Res Inst India Bangalore, WMG Grp, Bangalore, Karnataka, India
Source
2013 SECOND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND SECURITY (ADCONS 2013) | 2013
Keywords
document clustering; Web browser; MajorClust; intelligent browsing; Web history;
DOI
10.1109/ADCONS.2013.20
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Accessing relevant information quickly, given limited time and space, is a major issue in Web browsers, especially on mobile devices. In this paper we propose a framework for grouping similar Web documents in a browser based on the content of the browsed pages. This grouping can help reduce clutter and enable the user to access relevant Web information quickly. The clustering algorithm we use is MajorClust, a document similarity algorithm based on tokenizing the words in the document and then computing a cosine similarity measure to estimate the distance between the words. The entire clustering algorithm is implemented inside the browser, without the need for an external Web server. We implemented and tested the algorithm on a mobile browser and obtained accurate, finer-grained clustering of Web pages than Alexa's sub-categories.
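The pipeline named in the abstract (tokenize, compute cosine similarity, cluster with MajorClust) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the cosine measure is applied to per-document term-frequency vectors, and the function names, the `threshold` and `max_rounds` parameters, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def majorclust(docs, threshold=0.2, max_rounds=100):
    """Simplified MajorClust: returns one cluster label per document.

    threshold and max_rounds are illustrative assumptions, not values
    taken from the paper.
    """
    vectors = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Weighted similarity graph: keep only edges at or above the threshold.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(vectors[i], vectors[j])
            if s >= threshold:
                weights[i][j] = weights[j][i] = s

    labels = list(range(n))  # every document starts in its own cluster
    for _ in range(max_rounds):
        changed = False
        for i in range(n):
            # Sum the edge weights from document i to each neighbouring cluster.
            attraction = defaultdict(float)
            for j in range(n):
                if weights[i][j] > 0.0:
                    attraction[labels[j]] += weights[i][j]
            if not attraction:
                continue  # an isolated document keeps its own label
            # Adopt the most attractive cluster; ties go to the smallest
            # label (an assumption made here to keep the result deterministic).
            best = min(attraction, key=lambda lab: (-attraction[lab], lab))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


pages = [
    "python code programming tutorial",
    "python programming code examples",
    "football match score today",
    "football score and match report",
]
labels = majorclust(pages)
```

In this toy run the two programming pages end up sharing one label and the two football pages another, because no edge connects the two groups. A real browser integration would also need page-text extraction and stop-word handling, which this sketch leaves out.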
Pages: 164-168
Page count: 5