DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning

Cited by: 46
Authors
Triet Huynh Minh Le [1]
Hin, David [1,2]
Croft, Roland [1,2]
Babar, M. Ali [1,2]
Affiliations
[1] Univ Adelaide, CREST Ctr Res Engn Software Technol, Adelaide, SA, Australia
[2] Cyber Secur Cooperat Res Ctr, Adelaide, SA, Australia
Source
2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021) | 2021
Keywords
Software vulnerability; Vulnerability assessment; Deep learning; Multi-task learning; Mining software repositories; Software security
DOI
10.1109/ASE51524.2021.9678622
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
It is increasingly suggested to identify Software Vulnerabilities (SVs) in code commits to give early warnings about potential security risks. However, there is a lack of effort to assess vulnerability-contributing commits right after they are detected to provide timely information about the exploitability, impact and severity of SVs. Such information is important to plan and prioritize the mitigation for the identified SVs. We propose a novel Deep multi-task learning model, DeepCVA, to automate seven Commit-level Vulnerability Assessment tasks simultaneously based on Common Vulnerability Scoring System (CVSS) metrics. We conduct large-scale experiments on 1,229 vulnerability-contributing commits containing 542 different SVs in 246 real-world software projects to evaluate the effectiveness and efficiency of our model. We show that DeepCVA is the best-performing model with 38% to 59.8% higher Matthews Correlation Coefficient than many supervised and unsupervised baseline models. DeepCVA also requires 6.3 times less training and validation time than seven cumulative assessment models, leading to significantly less model maintenance cost as well. Overall, DeepCVA presents the first effective and efficient solution to automatically assess SVs early in software systems.
Pages: 717-729
Page count: 13