Explaining poor performance of text-based machine learning models for vulnerability detection

Cited by: 0
Authors
Napier, Kollin [1 ]
Bhowmik, Tanmay [2 ]
Chen, Zhiqian [2 ]
Affiliations
[1] Mississippi Gulf Coast Community Coll, Mississippi Artificial Intelligence Network (MAIN), Perkinston, MS 39507 USA
[2] Mississippi State Univ, Dept Comp Sci & Engn, Mississippi State, MS USA
Keywords
Text-based analysis; Machine learning models; Explainability
DOI
10.1007/s10664-024-10519-8
Chinese Library Classification
TP31 [Computer software]
Discipline codes
081202; 0835
Abstract
As software vulnerabilities grow in severity, machine learning models are increasingly being adopted to combat this threat. Research in this area has introduced a variety of such models and approaches. Although the models differ in performance, there is an overall lack of explainability regarding how a model learns and predicts. Furthermore, recent research suggests that models which interpret source code as plain text, known as "text-based" models, perform poorly at detecting vulnerabilities. To help explain this poor performance, we explore the dimensions of explainability. Building on recent studies of text-based models, we experiment with removing overlapping features that appear in both the training and testing datasets, which we deem "cross-cutting". We conduct scenario experiments that remove such cross-cutting data, reassess model performance, and examine how the removal of these features affects the results. Our results show that removing cross-cutting features may improve model performance in general, pointing to explainable dimensions regarding data dependency and agnostic models. Overall, we conclude that model performance can be improved, and that explainable aspects of such models can be identified through empirical analysis of their performance.
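As a rough illustration of the idea described in the abstract, the sketch below removes "cross-cutting" tokens (features occurring in both the training and testing corpora) and retrains a simple text-based classifier. This is a minimal sketch under assumed choices: the tokenizer, the scikit-learn pipeline, the helper name remove_cross_cutting, and the toy data and labels are all illustrative, not the authors' implementation.

# Minimal sketch (assumption, not the paper's code): strip tokens shared by
# the training and testing corpora, then retrain a text-based classifier.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

def remove_cross_cutting(train_texts, test_texts):
    """Drop every token that appears in both corpora from both corpora."""
    tokenize = CountVectorizer().build_analyzer()
    train_vocab = {tok for doc in train_texts for tok in tokenize(doc)}
    test_vocab = {tok for doc in test_texts for tok in tokenize(doc)}
    shared = train_vocab & test_vocab          # the "cross-cutting" features
    keep = lambda doc: " ".join(t for t in tokenize(doc) if t not in shared)
    return [keep(d) for d in train_texts], [keep(d) for d in test_texts]

# Toy example: source code treated as plain text, with hypothetical labels
# (1 = vulnerable, 0 = not vulnerable).
train_docs = ["strcpy ( buf , user_input )", "memcpy ( dst , src , len )"]
train_labels = [1, 0]
test_docs = ["strcpy ( buffer , request_data )"]

train_clean, test_clean = remove_cross_cutting(train_docs, test_docs)
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_clean)
X_test = vectorizer.transform(test_clean)
model = RandomForestClassifier(random_state=0).fit(X_train, train_labels)
print(model.predict(X_test))

In practice, the retrained model would then be scored on held-out labels with standard metrics (e.g., precision, recall, F1) and compared against the model trained with the cross-cutting features left in.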
Pages: 44