Improved Similarity Measures For Software Clustering

Cited by: 18
Authors
Naseem, Rashid [1]
Maqbool, Onaiza [1]
Muhammad, Siraj [2]
Affiliations
[1] Quaid I Azam Univ, Dept Comp Sci, Islamabad, Pakistan
[2] Elixir Technol Pakistan PVT LTD, Islamabad, Pakistan
Source
2011 15TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR) | 2011
Keywords
Software Clustering; Jaccard-NM Measure; Jaccard Measure; Unbiased Ellenberg-NM Measure; Russell & Rao Measure
DOI
10.1109/CSMR.2011.9
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Software clustering is a useful technique for recovering the architecture of a software system. The results of clustering depend on the choice of entities, features, similarity measures and clustering algorithms. Different similarity measures have been used to determine the similarity between entities during the clustering process. In the software architecture recovery domain, the Jaccard and the Unbiased Ellenberg measures have shown better results than other measures for binary and non-binary features, respectively. In this paper we analyze the Russell & Rao measure for binary features to show the conditions under which it is expected to perform better than Jaccard. We also show how our proposed Jaccard-NM measure is suitable for software clustering, and we propose its counterpart for non-binary features. Experimental results indicate that our proposed Jaccard-NM measure and the Russell & Rao measure perform better than the Jaccard measure for binary features, while for non-binary features the proposed Unbiased Ellenberg-NM measure produces results that are closer to the decomposition prepared by experts.
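For orientation, the two established binary measures named in the abstract can be sketched as follows. This is a minimal illustration that assumes the standard textbook definitions, Jaccard = a / (a + b + c) and Russell & Rao = a / (a + b + c + d), where a counts features present in both entities, b and c count features present in only one of them, and d counts features absent from both. The function names and the toy feature vectors are hypothetical and are not taken from the paper; the proposed Jaccard-NM and Unbiased Ellenberg-NM measures are not shown because the abstract does not define them.

```python
from typing import Sequence, Tuple


def contingency(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two binary feature vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p and q)        # present in both
    b = sum(1 for p, q in zip(x, y) if p and not q)    # present only in x
    c = sum(1 for p, q in zip(x, y) if not p and q)    # present only in y
    d = len(x) - a - b - c                             # absent from both
    return a, b, c, d


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """Jaccard similarity a / (a + b + c); joint absences (d) are ignored."""
    a, b, c, _ = contingency(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0


def russell_rao(x: Sequence[int], y: Sequence[int]) -> float:
    """Russell & Rao similarity a / (a + b + c + d); d enters the denominator."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    return a / n if n else 0.0


# Hypothetical entities described by six binary features each.
e1 = [1, 1, 0, 0, 0, 0]
e2 = [1, 1, 0, 0, 0, 0]   # identical to e1, two shared features
e3 = [1, 1, 1, 1, 0, 0]
e4 = [1, 1, 1, 1, 0, 0]   # identical to e3, four shared features

print(jaccard(e1, e2), jaccard(e3, e4))
print(russell_rao(e1, e2), russell_rao(e3, e4))
```

In this toy case Jaccard scores both pairs identically because each pair matches perfectly, while Russell & Rao scores the pair with four shared features higher than the pair with two, since joint absences still count toward its denominator.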
Pages: 45-54
Page count: 10
Related papers
50 in total
  • [1] Improved binary similarity measures for software modularization
    Naseem, Rashid
    Deris, Mustafa Bin Mat
    Maqbool, Onaiza
    Li, Jing-peng
    Shahzad, Sara
    Shah, Habib
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2017, 18 (08) : 1082 - 1107
  • [2] Improved binary similarity measures for software modularization
    Rashid Naseem
    Mustafa Bin Mat Deris
    Onaiza Maqbool
    Jing-peng Li
    Sara Shahzad
    Habib Shah
    Frontiers of Information Technology & Electronic Engineering, 2017, 18 : 1082 - 1107
  • [3] Similarity Measures for Spatial Clustering
    Hamdad, Leila
    Benatchba, Karima
    Ifrez, Soraya
    Mohguen, Yasmine
    COMPUTATIONAL INTELLIGENCE AND ITS APPLICATIONS, 2018, 522 : 25 - 36
  • [4] Clustering of documents via similarity measures
    Rezanková, H
    Húsek, D
    Smid, J
    Snásel, V
    CIC'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN COMPUTING, 2003, : 292 - 299
  • [5] SIMILARITY MEASURES FOR NOMINAL VARIABLE CLUSTERING
    Sulc, Zdenek
    8TH INTERNATIONAL DAYS OF STATISTICS AND ECONOMICS, 2014, : 1536 - 1545
  • [6] An improved Document Clustering Approach with Multi-Viewpoint based on different similarity measures
    Gupta, Anjali
    Dubey, Rahul
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 152 - 157
  • [7] Evaluating similarity measures for software decompositions
    Wen, ZH
    Tzerpos, V
    20TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, PROCEEDINGS, 2004, : 368 - 377
  • [8] A New Binary Similarity Measure Based on Integration of the Strengths of Existing Measures: Application to Software Clustering
    Naseem, Rashid
    Deris, Mustafa Mat
    RECENT ADVANCES ON SOFT COMPUTING AND DATA MINING, 2017, 549 : 304 - 315
  • [9] Comparison of similarity measures for clustering Turkish documents
    Madylova, Ainura
    Oguducu, Sule Guenduez
    INTELLIGENT DATA ANALYSIS, 2009, 13 (05) : 815 - 832
  • [10] Similarity Measures Recommendation for Mixed Data Clustering
    Diop, Abdoulaye
    El Malki, Nabil
    Chevalier, Max
    Peninou, Andre
    Teste, Olivier
    Jimenez, Geoffrey Roman
    SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT 36TH INTERNATIONAL CONFERENCE, SSDBM 2024, 2024,