Fine-grained smart contract vulnerability detection by heterogeneous code feature learning and automated dataset construction

Cited by: 6
Authors
Cai, Jie [1, 2]
Li, Bin [1 ]
Zhang, Tao [2 ]
Zhang, Jiale [1 ]
Sun, Xiaobing [1 ]
Affiliations
[1] Yangzhou Univ, Sch Informat Engn, Yangzhou, Peoples R China
[2] Macau Univ Sci & Technol, Sch Comp Sci & Engn, Macau, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Smart contract; Static analysis; Vulnerability detection; Graph neural network;
DOI
10.1016/j.jss.2023.111919
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Context: Several deep learning-based approaches to smart contract vulnerability detection have been proposed recently. However, applying deep learning to fine-grained vulnerability detection in smart contracts remains challenging for two reasons: datasets with enough statement-level labeled samples are lacking, and code feature learning neglects the heterogeneity between syntax and semantic features. Objective: To enable deep learning for fine-grained smart contract vulnerability detection, we propose a security best practices (SBP) based dataset construction approach to address the scarcity of datasets, and a syntax-sensitive graph neural network to address the challenge of heterogeneous code feature learning. Method: The dataset construction approach is motivated by the insight that smart contract code fragments guarded by security best practices may contain vulnerabilities in their original, unguarded form. We therefore locate and strip security best practices from smart contract code to recover its original vulnerable form and label the resulting samples. Because the tree-structured syntax features embodied in the abstract syntax tree (AST) are heterogeneous with the graph-structured semantic features reflected by relations between statements, we propose a code graph whose nodes are the AST subtrees of individual statements, together with a syntax-sensitive graph neural network that augments a graph neural network with a child-sum tree-LSTM cell to learn these heterogeneous features for fine-grained smart contract vulnerability detection. Results: We compare our approach with three state-of-the-art deep learning-based approaches that support only contract-level vulnerability detection and two popular static analysis-based approaches that support fine-grained detection. The experimental results show that our approach outperforms the baselines at both coarse and fine granularities.
Conclusion: We propose using the security best practices found in smart contract code to construct a dataset with statement-level labels. To learn both tree-structured syntax features and graph-structured semantic features of code, we propose a syntax-sensitive graph neural network. The experimental results show that our approach outperforms the baselines.
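The dataset construction idea in the abstract (locate a security best practice, strip it, and label the statement it protected) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say which best practices are handled or how they are located. The sketch assumes, purely for illustration, that SafeMath-style calls such as `a.add(b)` are the guard, and it uses a line-based regular expression where a real pipeline would more plausibly work on the AST. The names `strip_guards`, `SAFE_CALL`, `OPERATORS`, and `EXAMPLE` are hypothetical.

```python
import re

# Illustrative guard pattern (assumption): SafeMath-style calls such as a.add(b).
SAFE_CALL = re.compile(r"(\w+)\.(add|sub|mul)\((\w+)\)")
OPERATORS = {"add": "+", "sub": "-", "mul": "*"}


def strip_guards(source):
    """Replace guarded arithmetic with raw operators and label affected lines.

    Returns (stripped_source, labels), where labels holds the 1-based line
    numbers of statements whose guard was removed.
    """
    stripped_lines = []
    labels = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Rewrite every guarded call on this line and count the rewrites.
        new_line, count = SAFE_CALL.subn(
            lambda m: f"{m.group(1)} {OPERATORS[m.group(2)]} {m.group(3)}",
            line,
        )
        if count:
            # The statement lost its guard, so mark it as a labeled sample.
            labels.append(number)
        stripped_lines.append(new_line)
    return "\n".join(stripped_lines), labels


# Hypothetical Solidity fragment used only to exercise the sketch.
EXAMPLE = """function deposit(uint amount) public {
    total = total.add(amount);
    count = count + 1;
}"""

stripped, labels = strip_guards(EXAMPLE)
```

In this sketch only line 2 of the fragment is labeled, because it is the only statement whose guard was removed; the unguarded `count = count + 1;` is left unlabeled, which mirrors the abstract's point that labels come from the presence of a best practice rather than from manual auditing.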
Pages: 14