Document Clustering Using Hybrid XOR Similarity Function for Efficient Software Component Reuse

Cited by: 8

Authors:

Radhakrishna, Vangipuram [1]

Srinivas, C. [2]

Rao, C. V. Guru [3]

Affiliations:

[1] VNR Vignana Jyothi Inst Engn & Technol, Dept IT, Hyderabad, Andhra Pradesh, India

[2] Kakatiya Inst Technol, Warangal, Andhra Pradesh, India

[3] SR Engn Coll, Warangal, Andhra Pradesh, India

Source:

FIRST INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND QUANTITATIVE MANAGEMENT | 2013 / Vol. 17

Keywords:

hybrid xor; clustering; frequent itemsets; cluster; DISCOVERY;

DOI:

10.1016/j.procs.2013.05.017

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In this paper, a generalized approach is proposed for clustering a given set of documents, text files, or software components for reuse. It is based on a new similarity function, called the hybrid XOR function, defined to measure the degree of similarity between any two document sets or any two software components. We construct a similarity matrix of order (n-1) by n for n document sets or software components by applying the hybrid XOR function to each pair of document sets. We define and design a clustering algorithm that takes the similarity matrix as input and outputs a set of clusters formed dynamically, in contrast to other clustering algorithms that fix the number of clusters in advance and then fit each document to one of those clusters or classes. The approach uses only simple computations. (C) 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of the organizers of the 2013 International Conference on Information Technology and Quantitative Management.


Pages: 121-128

Page count: 8

References: 16 in total

[1]

Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072

[2]

[Anonymous], 2002, P 8 ACM SIGKDD INT C, DOI 10.1145/775047.775110

[3]

Hajdinjak Melita, 2009, INFORMATICA, V33, P143

[4] Discovery of maximum length frequent itemsets [J].

Hu, Tianming ;

Sung, Sam Yuan ;

Xiong, Hui ;

Fu, Qian .

INFORMATION SCIENCES, 2008, 178 (01) :69-87

[5]

Jiang Jung-i, 2011, IEEE T KNOWLEDGE DAT, V23

[6]

Khuzaima S.Daudjee, 1994, ORG REUSABLE SOFTWAR

[7]

KOU G, 2012, MULTIPLE FACTOR HIER, V197, P123

[8]

Kumar Sunil, 2007, P IEEE INT S COMP IN

[9] Text document clustering based on neighbors [J].

Luo, Congnan ;

Li, Yanjun ;

Chung, Soon M. .

DATA & KNOWLEDGE ENGINEERING, 2009, 68 (11) :1271-1288

[10]

Mitchell Brain S., 2001, P IEEE INT C SOFTW M, P744
