Combining lexical and structural information to reconstruct software layers

Cited by: 18
Authors
Belle, Alvine Boaye [1]
El Boussaidi, Ghizlane [1]
Kpodjedo, Segla [1]
Affiliation
[1] Ecole Technol Super, Dept Software & IT Engn, Montreal, PQ, Canada
Keywords
Software maintenance; Reverse engineering; Architecture recovery; Layering style; Hill climbing; Latent Dirichlet Allocation; Architecture
DOI
10.1016/j.infsof.2016.01.008
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: The architectures of existing software systems generally lack documentation or have drifted from their initial design due to repeated maintenance operations. To evolve such systems, it is necessary to reconstruct and document their architectures. Many approaches have been proposed to support the architecture recovery process, but few of them consider the architectural style of the system under analysis. Moreover, most existing approaches rely on structural dependencies between entities of the system and do not exploit the semantic information hidden in the source code of these entities. Objective: We propose an approach that exploits both linguistic and structural information to recover the software architecture of object-oriented (OO) systems. The focus of this paper is the recovery of architectures that comply with the layered style, which is widely used in software systems. Method: In this work, we (i) recover the responsibilities of the system under study and (ii) assign these responsibilities to different abstraction layers. To do so, we use the linguistic information extracted from the source code to recover clusters corresponding to the responsibilities of the system. We then assign these clusters to layers using the system's structural information and the layered style constraints. We formulate the recovery of the responsibilities and their assignment to layers as optimization problems that we solve using search-based algorithms. Results: To assess the effectiveness of our approach, we conducted experiments on four open source systems. The resulting layerings yielded higher precision and recall than those generated by a structure-based layering approach. Conclusion: Our hybrid lexical-structural approach is effective and shows potential for significant improvement over techniques based only on structural information. © 2016 Elsevier B.V. All rights reserved.
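The second step of the method (assigning clusters to layers with a search-based algorithm) can be illustrated with a minimal hill-climbing sketch. This is not the authors' implementation: the cost function, the penalty weights (`intra_w`, `skip_w`, `back_w`), the cluster names, and the dependency weights below are all assumptions chosen for illustration. The sketch only assumes that a dependency from a layer to the layer directly below it is free, while same-layer, layer-skipping, and upward dependencies are penalized.

```python
import random


def layering_cost(assign, deps, intra_w=1, skip_w=2, back_w=4):
    """Penalty for dependencies that violate a strict layering (assumed weights).

    assign: dict mapping cluster name -> layer index (0 is the bottom layer)
    deps:   dict mapping (source, target) -> dependency weight
    """
    cost = 0
    for (src, dst), weight in deps.items():
        gap = assign[src] - assign[dst]
        if gap < 0:        # lower layer depends on a higher one (back-call)
            cost += back_w * weight
        elif gap == 0:     # dependency inside a single layer
            cost += intra_w * weight
        elif gap > 1:      # dependency jumps over at least one layer (skip-call)
            cost += skip_w * weight
        # gap == 1: dependency on the adjacent lower layer, no penalty
    return cost


def hill_climb(clusters, deps, n_layers, seed=0):
    """Move one cluster at a time to a better layer until no move helps."""
    rng = random.Random(seed)
    assign = {c: rng.randrange(n_layers) for c in clusters}
    best = layering_cost(assign, deps)
    improved = True
    while improved:
        improved = False
        for c in clusters:
            keep = assign[c]
            for layer in range(n_layers):
                if layer == keep:
                    continue
                assign[c] = layer
                cost = layering_cost(assign, deps)
                if cost < best:
                    best, keep, improved = cost, layer, True
            assign[c] = keep  # settle on the best layer found for this cluster
    return assign, best


# Hypothetical clusters and dependency weights, for illustration only.
clusters = ["ui", "service", "dao"]
deps = {("ui", "service"): 3, ("service", "dao"): 2, ("ui", "dao"): 1}
assignment, cost = hill_climb(clusters, deps, n_layers=3)
print(assignment, cost)
```

Hill climbing stops at a local optimum, so the result depends on the random starting assignment; restarting from several seeds and keeping the cheapest layering is a common mitigation.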
Pages: 1-16
Number of pages: 16