Frequent Term Based Text Document Clustering Using Similarity Measures: A Novel Approach

Cited by: 0
Authors
Gupta, Vijay Kumar [1 ]
Dutta, Maitreyee [2 ]
Kumar, Manoj [3 ]
Affiliations
[1] Govt Girls Polytech, Dept IT, Charkhari, Mahoba, India
[2] NITTTR, Dept CS&E, Chandigarh, India
[3] BBDNITM, Dept IT, Lucknow, Uttar Pradesh, India
Keywords
Clustering; Data Mining; Cosine Similarity; Similarity Index; Fuzzy Logic; Support Vector Machine; ALGORITHM;
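DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering Technology]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808 ; 0809 ;
Abstract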
Clustering is a long-established way to ensure that documents are retrieved promptly and according to the requirement. It keeps similar documents together so that they can be retrieved easily. The measure of the relation between two documents is called a similarity index. Several kinds of similarity index are already in use. The proposed algorithm combines two kinds of similarity index to produce a new similarity index. The similarity index plays a vital role in the clustering and classification procedure. The proposed algorithm also uses fuzzy logic for the clustering rules, and the result is then classified by a Support Vector Machine to justify the accuracy of the proposed solution.
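The idea of combining two similarity indexes into one can be illustrated with a minimal sketch. The abstract does not say which two indexes are combined or how; the keywords mention cosine similarity, so the example below assumes cosine similarity over term counts as one index, Jaccard similarity over term sets as the other, and a weighted average as the combination rule. The function names and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_counts(text):
    """Split a document on whitespace and count lower-cased terms."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Shared terms divided by all distinct terms (assumed second index)."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def combined_similarity(doc1, doc2, weight=0.5):
    """Weighted average of the two indexes (an assumed combination rule)."""
    a, b = term_counts(doc1), term_counts(doc2)
    return (weight * cosine_similarity(a, b)
            + (1 - weight) * jaccard_similarity(a, b))
```

With `weight=0.5` both indexes contribute equally; a pairwise matrix of such scores could then feed a clustering step, though how the paper uses its combined index is not stated in this record.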
Pages: 164-169
Page count: 6
Related papers
50 in total
  • [1] Frequent Term Based Text Document Clustering: A New Approach
    Kumar, Manoj
    Yadav, D. K.
    Gupta, Vijay Kumar
    2015 INTERNATIONAL CONFERENCE ON SOFT COMPUTING TECHNIQUES AND IMPLEMENTATIONS (ICSCTI), 2015,
  • [2] A Frequent Term Based Text Clustering Approach Using Novel Similarity Measure
    Reddy, G. Suresh
    Rajinikanth, T. V.
    Rao, A. Ananda
    SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2014, : 495 - 499
  • [3] Analysis of similarity measures with WordNet based text document clustering
    Sandhya, Nadella
    Govardhan, A.
    Advances in Intelligent and Soft Computing, 2012, 132 AISC : 703 - 714
  • [4] Analysis of Similarity Measures with WordNet Based Text Document Clustering
    Sandhya, Nadella
    Govardhan, A.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS 2012 (INDIA 2012), 2012, 132 : 703 - +
  • [5] Medical document clustering using ontology-based term similarity measures
    College of Information Science and Technology, Drexel University, Philadelphia, PA, United States
    Not available
    Int. J. Data Warehouse. Min., 2008, 1 (62-73):
  • [6] Efficient text document clustering with new similarity measures
    Lakshmi R.
    Baskar S.
    International Journal of Business Intelligence and Data Mining, 2021, 18 (01) : 109 - 126
  • [7] Text Clustering Approach Based on Maximal Frequent Term Sets
    Su, Chong
    Chen, Qingcai
    Wang, Xiaolong
    Meng, Xianjun
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1551 - 1556
  • [8] A comparative study of ontology based term similarity measures on PubMed document clustering
    Zhang, Xiaodan
    Jing, Liping
    Hu, Xiaohua
    Ng, Michael
    Zhou, Xiaohua
    ADVANCES IN DATABASES: CONCEPTS, SYSTEMS AND APPLICATIONS, 2007, 4443 : 115 - +
  • [9] A Frequent Term-Based Multiple Clustering Approach for Text Documents
    Zheng, Hai-Tao
    Chen, Hao
    Gong, Shu-Qin
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 602 - 609
  • [10] Text document clustering based on frequent word meaning sequences
    Li, Yanjun
    Chung, Soon M.
    Holt, John D.
    DATA & KNOWLEDGE ENGINEERING, 2008, 64 (01) : 381 - 404