A systematic mapping study on graph machine learning for static source code analysis

Cited by: 0
Authors
Maarleveld, Jesse [1]
Guo, Jiapan [1]
Feitosa, Daniel [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intellig, Nijenborgh 9, NL-9747 AG Groningen, Netherlands
Keywords
Graph machine learning; Graph neural networks; Static source code analysis; Systematic mapping study; AGREEMENT
DOI
10.1016/j.infsof.2025.107722
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: In recent years, graph machine learning and particularly graph neural networks have seen successful and widespread applications in many fields, including static source code analysis. Such machine learning techniques enable learning on rich information networks capable of representing different relations and entities. However, there have been no comprehensive studies investigating the use of graph machine learning for static source code analysis. There is no complete systematic picture of what techniques may be considered tried and tested, and where opportunities for future improvements can still be found.
Objective: The main goal of this study is to provide a broad overview of the state of the art of static source code analysis using graph machine learning.
Methods: A systematic mapping was performed covering 4499 studies, presenting a final selection of 323 primary studies.
Results: Among the selected studies, seven major sub-domains were identified. The use and combinations of artefacts, different graph representations, different features, and different machine learning models used were collected and categorised.
Conclusions: The use of graph learning, and in particular graph neural networks, has increased significantly since 2018. Although a wide variety of methods is used, across every dimension we investigated (artefacts, graphs, features, models), we found small sets of technologies which are used in the vast majority of studies. Future opportunities lie in exploring under-explored domains more thoroughly, exploring the use of additional artefacts alongside source code, and paying more attention to interpretability and explainability.
Pages: 17