Smart Contract Vulnerability Detection Based on Code Graph Embedding Approach

Cited by: 0
Authors
Zhai, Yiwen [1]
Yang, Jia [1]
Zhang, Mingwu [1, 2]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan 430068, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Source
FRONTIERS IN CYBER SECURITY, FCS 2024, PT I | 2024 / Vol. 2315
Funding
National Natural Science Foundation of China;
Keywords
smart contracts; contract graph; gated graph neural network; vulnerability detection; deep learning;
DOI
10.1007/978-981-96-0151-6_21
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The security of smart contracts has attracted widespread attention because vulnerabilities have caused huge property losses. However, existing program analysis techniques usually produce many false positives, resulting in low detection accuracy. In this paper, we propose a smart contract vulnerability detection method based on code graph embedding. We transform the vulnerability-related key functions in the source code, together with their dependencies, into graph features, which in turn serve as vulnerability features for smart contracts. Specifically, we first select the vulnerability-related functions in the source code, namely the key functions, and use the word2vec model to embed each key function, compressing the whole function into vector form. Second, we generate a call graph and a data dependency graph for each target smart contract. For simplicity, only the key function nodes in these two graphs are retained, and the two simplified graphs are merged into a single code graph. Finally, the vectors derived from the function source code are used as node features of the code graph, which is fed into a gated graph neural network (GGNN) model that learns the graph features to detect the presence of vulnerabilities. Existing deep learning-based vulnerability prediction methods also face challenges concerning training data, such as data duplication and unrealistic distribution of vulnerability classes. Therefore, we balance the adopted dataset using the synthetic minority oversampling technique (SMOTE). The experimental results show that our model achieves 5% higher precision and 7% higher recall than the best-performing model in the literature.
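The graph construction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions, because the abstract does not specify them: an edge is kept only when both endpoints are key functions, each merged edge is tagged with its origin ("call" or "data_dep"), and a function vector is the mean of its token vectors. The helper names (simplify, merge_graphs, mean_pool) and the toy contract data are hypothetical. In practice the token vectors would come from a trained word2vec model, and the resulting graph would be passed to a GGNN.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]             # (source function, target function)
TypedEdge = Tuple[str, str, str]   # (source, target, edge type)


def simplify(edges: List[Edge], key_functions: Set[str]) -> List[Edge]:
    """Keep only edges whose two endpoints are both key functions.

    Assumed simplification rule; the abstract only says that key
    function nodes are retained.
    """
    return [(u, v) for u, v in edges if u in key_functions and v in key_functions]


def merge_graphs(
    call_edges: List[Edge],
    data_dep_edges: List[Edge],
    key_functions: Set[str],
) -> List[TypedEdge]:
    """Merge the simplified call graph and data dependency graph,
    tagging every edge with the graph it came from."""
    merged = [(u, v, "call") for u, v in simplify(call_edges, key_functions)]
    merged += [(u, v, "data_dep") for u, v in simplify(data_dep_edges, key_functions)]
    return merged


def mean_pool(
    tokens: List[str], token_vectors: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the vectors of known tokens (assumed pooling choice).

    Returns a zero vector when no token has an embedding.
    """
    known = [token_vectors[t] for t in tokens if t in token_vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Toy contract (hypothetical): two key functions and one helper.
key_functions = {"withdraw", "deposit"}
call_edges = [("withdraw", "log"), ("deposit", "withdraw")]
data_dep_edges = [("withdraw", "deposit"), ("log", "deposit")]

code_graph = merge_graphs(call_edges, data_dep_edges, key_functions)

# Stand-in for word2vec output: a 2-dimensional vector per token.
token_vectors = {"msg": [1.0, 0.0], "call": [0.0, 1.0]}
function_tokens = {"withdraw": ["msg", "call", "value"], "deposit": ["msg"]}

# Node features of the code graph: one pooled vector per key function.
node_features = {
    name: mean_pool(function_tokens[name], token_vectors, dim=2)
    for name in sorted(key_functions)
}
```

Edges that touch the non-key helper function are dropped, so the merged graph contains only the two key function nodes, each carrying its pooled vector as a node feature.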
Pages: 317-332
Page count: 16