Frequent Term Based Text Document Clustering Using Similarity Measures: A Novel Approach

Cited by: 0

Authors:

Gupta, Vijay Kumar [1]

Dutta, Maitreyee [2]

Kumar, Manoj [3]

Affiliations:

[1] Govt Girls Polytech, Dept IT, Charkhari, Mahoba, India

[2] NITTTR, Dept CS&E, Chandigarh, India

[3] BBDNITM, Dept IT, Lucknow, Uttar Pradesh, India

Source:

2017 FOURTH INTERNATIONAL CONFERENCE ON IMAGE INFORMATION PROCESSING (ICIIP) | 2017

Keywords:

Clustering; Data Mining; Cosine Similarity; Similarity Index; Fuzzy Logic; Support Vector Machine; ALGORITHM;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Clustering is a long-established way to ensure that documents are retrieved quickly and according to the requirement. It keeps similar documents together so that they can be retrieved easily. The measure used to quantify the relation between two documents is called a similarity index, and several kinds of similarity index are already in use. The proposed algorithm takes two kinds of similarity index and combines them to produce a new one. The similarity index plays a vital role in both clustering and classification. The proposed algorithm also uses fuzzy logic for the clustering rules, and the result is then classified by a Support Vector Machine to assess the accuracy of the proposed solution.
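The abstract does not say which two similarity indices are combined or how. As a minimal sketch only, the example below assumes cosine similarity over term counts (listed in the keywords) and Jaccard similarity over term sets (an assumption, not taken from the paper), blended by a weighted average. The function names and the `alpha` weight are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the raw term-count vectors of two texts."""
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(doc_a, doc_b):
    """Jaccard similarity between the term sets of two texts (assumed second index)."""
    terms_a = set(doc_a.lower().split())
    terms_b = set(doc_b.lower().split())
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def combined_similarity(doc_a, doc_b, alpha=0.5):
    """Hypothetical combination: weighted average of the two indices.

    alpha is an illustrative weight, not a value from the paper.
    """
    return (alpha * cosine_similarity(doc_a, doc_b)
            + (1 - alpha) * jaccard_similarity(doc_a, doc_b))


if __name__ == "__main__":
    a = "document clustering with similarity measures"
    b = "text clustering using similarity index"
    print(combined_similarity(a, b))
```

A weighted average keeps the combined score in the range [0, 1] whenever both inputs are, which makes it easy to threshold in a later clustering step; other combinations (product, maximum) would be equally plausible readings of the abstract.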

Pages: 164-169

Page count: 6
