Enhancing software modularization via semantic outliers filtration and label propagation

Cited by: 4
Authors
Yang, Kaiyuan [1]
Wang, Junfeng [1]
Fang, Zhiyang [2]
Wu, Peng [3]
Song, Zihua [2]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[2] Sichuan Univ, Sch Cyber Sci & Engn, Chengdu 610207, Peoples R China
[3] Sichuan Tourism Univ, Sch Informat & Engn, Chengdu 610100, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software modularization; Software clustering; Software maintenance; Semantic outliers; Lexical information
DOI
10.1016/j.infsof.2021.106818
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: The modular structure of a software system often drifts from its intended design as the system evolves. Improving that structure calls for software clustering techniques, which partition a system into meaningful modules. Many clustering approaches rely on semantic information and group software entities that use similar vocabulary. However, semantic outliers, which obstruct the clustering process, are rarely considered.
Objective: To address semantic outliers, this paper proposes a two-stage software clustering approach named EVOL (Enhancing Via Outliers filtration and Label propagation).
Methods: A feature density-based outlier detection algorithm computes the local outlier factor of each feature. The semantic outliers are then filtered out, and the remaining high-quality features are clustered to construct a partition skeleton. After that, each outlier is assigned to a suitable cluster by label propagation.
Results: To assess the effectiveness of the proposed approach, this paper conducts experiments on six folders from Mozilla Firefox and four other software systems, using the original design concepts and modular structure provided by the developers as the reference. The average MoJoFM score improves significantly, by 6% to 35%, over six other state-of-the-art clustering techniques. The results demonstrate that filtering out the outliers improves the clustering results and that label propagation can place the outliers into suitable clusters.
Conclusion: This paper proposes EVOL, a new software clustering approach that combines semantic outlier filtration with label propagation. The experimental results show that EVOL is useful for enhancing the quality of software modularization.
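The two stages described under Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the propagation scheme, or any parameter values, so the single-linkage clustering, the nearest-neighbour majority vote used as a stand-in for label propagation, the Euclidean distance, the neighbourhood size `k=3`, and the cutoff `threshold=1.5` are all assumptions chosen for illustration. The local outlier factor is simplified to exactly `k` neighbours per point, ignoring distance ties.

```python
import math
from collections import Counter

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return dists[:k]

def local_outlier_factors(points, k=3):
    """Simplified LOF: density of each point relative to its k neighbours."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    lrd = []                              # local reachability density
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def single_linkage(points, n_clusters):
    """Assumed stand-in clusterer: merge the two closest clusters repeatedly."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def propagate(points, seed_labels, k=3):
    """Assumed stand-in for label propagation: k-nearest labelled majority vote.

    Unlabelled points are processed closest-to-labelled first, and each newly
    labelled point can vote for the ones that follow.
    """
    labels = dict(seed_labels)
    pending = [i for i in range(len(points)) if i not in labels]
    while pending:
        i = min(pending, key=lambda p: min(math.dist(points[p], points[j])
                                           for j in labels))
        nearest = sorted(labels, key=lambda j: math.dist(points[i], points[j]))[:k]
        labels[i] = Counter(labels[j] for j in nearest).most_common(1)[0][0]
        pending.remove(i)
    return labels

def evol_sketch(points, n_clusters, k=3, threshold=1.5):
    """Filter outliers by LOF, cluster the rest, then label the outliers."""
    lof = local_outlier_factors(points, k)
    core = [i for i, score in enumerate(lof) if score <= threshold]
    skeleton = single_linkage([points[i] for i in core], n_clusters)
    seed = dict(zip(core, skeleton))
    return propagate(points, seed, k), lof

# Toy feature vectors: two tight groups plus one point lying between them.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11),
               (3, 3)]
demo_labels, demo_lof = evol_sketch(demo_points, n_clusters=2)
```

In this toy input the point `(3, 3)` receives a local outlier factor well above 1, so it is held out while the two tight groups form the skeleton, and it is then attached to the group of its nearest labelled neighbours. Real software entities would first need to be turned into feature vectors (for example from identifier and comment vocabulary), which this sketch does not cover.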
Pages: 11