Document Clustering Based on Fuzzy Similarity

Authors
Zhou, Jingli [1]
Nie, Xuejun [1]
Qin, Leihua [1]
Zhu, Jianfeng [1]
Affiliation
[1] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Hubei, Peoples R China
Source
APPLIED MECHANICS AND MECHANICAL ENGINEERING, PTS 1-3 | 2010 / Vols. 29-32
Keywords
Document Clustering; Fuzzy Similarity; Mutual Information
DOI
10.4028/www.scientific.net/AMM.29-32.2620
Chinese Library Classification (CLC) number
TH [Machinery and Instrument Industry]
Discipline classification code
0802
Abstract
This paper proposes a novel fuzzy similarity measure based on the relationships between terms and categories. A term-category matrix represents these relationships: each element of the matrix is the degree of membership of a term in a category, computed from term frequency-inverse document frequency (TF-IDF) weights and the fuzzy relationships between documents and categories. The fuzzy similarity accounts for documents that belong to multiple categories and is computed using fuzzy operators. The experimental results show that the proposed fuzzy similarity surpasses other common similarity measures both in the reliability of the derived document clustering results and in document clustering accuracy.
Pages: 2620-2626
Page count: 7
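
Illustrative sketch
The abstract names the ingredients of the measure (a term-category membership matrix, TF-IDF weights, fuzzy document-category relationships, fuzzy operators) but does not give formulas. The Python sketch below shows one way such a pipeline could be assembled; it is not the authors' method. Everything in it is an assumption made for illustration: the smoothed IDF, the weighted-average definition of term-category membership, the min/max (fuzzy Jaccard) operators, the function names, and the toy documents and membership degrees.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        weights.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return weights

def term_category_matrix(weights, doc_membership):
    """matrix[t][c]: TF-IDF-weighted average of the document-category degrees
    over the documents containing term t (assumed definition)."""
    n_cat = len(doc_membership[0])
    num, den = {}, Counter()
    for w, mu in zip(weights, doc_membership):
        for t, wt in w.items():
            row = num.setdefault(t, [0.0] * n_cat)
            for c in range(n_cat):
                row[c] += wt * mu[c]
            den[t] += wt
    return {t: [x / den[t] for x in row] for t, row in num.items()}

def doc_profile(w, matrix, n_cat):
    """Category membership vector of one document, averaged over its terms."""
    total = sum(w.values())
    return [sum(wt * matrix[t][c] for t, wt in w.items()) / total
            for c in range(n_cat)]

def fuzzy_similarity(a, b):
    """Fuzzy Jaccard: min as intersection, max as union (assumed operators)."""
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / union if union > 0 else 0.0

# Hypothetical toy corpus: two text-mining documents, one mechanical one.
docs = [["fuzzy", "cluster", "term"],
        ["fuzzy", "cluster", "matrix"],
        ["engine", "gear", "shaft"]]
# Hypothetical fuzzy document-category degrees; a document may belong
# partially to both categories.
doc_membership = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

weights = tfidf(docs)
matrix = term_category_matrix(weights, doc_membership)
profiles = [doc_profile(w, matrix, 2) for w in weights]
sim_01 = fuzzy_similarity(profiles[0], profiles[1])
sim_02 = fuzzy_similarity(profiles[0], profiles[2])
```

In this toy setup the two documents that share terms and category degrees receive a higher similarity than the unrelated pair, which is the behavior a clustering algorithm would rely on. Other fuzzy operators (for example product and probabilistic sum) could be substituted in fuzzy_similarity without changing the rest of the sketch.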