Prior studies of software vulnerabilities generally focus on vulnerability detection and have demonstrated the effectiveness of Graph Neural Network (GNN)-based approaches for that task. Because vulnerabilities come in many types with different degrees of severity, it is also beneficial to tell developers the type of each vulnerable code snippet. In this paper, we observe that the distribution of vulnerability types is long-tailed in practice: a small portion of classes have massive numbers of samples (i.e., head classes), while the others contain only a few samples (i.e., tail classes). Directly adopting previous vulnerability detection approaches tends to result in poor performance, mainly for two reasons. First, it is difficult to learn the vulnerability representation effectively because of the over-smoothing issue of GNNs. Second, vulnerability types in the tail are hard to predict because they have extremely few associated samples. To alleviate these issues, we propose a Long-taIled software VulnerABiLity typE classification approach, called LIVABLE. LIVABLE mainly consists of two modules: (1) a vulnerability representation learning module, which improves the propagation steps in the GNN with a differentiated propagation method so that node representations remain distinguishable, and which also involves a sequence-to-sequence model to enhance the vulnerability representations; and (2) an adaptive re-weighting module, which uses a novel training loss to adjust the learning weights for different types according to the training epoch and the number of associated samples. We verify the effectiveness of LIVABLE in both type classification and vulnerability detection tasks. For vulnerability type classification, experiments on the Fan et al. dataset show that LIVABLE outperforms the state-of-the-art methods by 24.18% in terms of accuracy and improves the performance in predicting tail classes by 7.7%.
To evaluate the efficacy of the vulnerability representation learning module in LIVABLE, we further compare it with recent vulnerability detection approaches on three benchmark datasets; the proposed representation learning module improves on the best baselines by 4.03% on average in terms of accuracy.
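To make the idea of re-weighting by training epoch and class size concrete, the following is a minimal sketch and not the loss used in LIVABLE, whose exact form is not given in this abstract. It assumes a hypothetical linear schedule that starts from uniform class weights and moves toward inverse-frequency weights as training proceeds; the function names `class_weights` and `weighted_cross_entropy` and the schedule itself are illustrative assumptions.

```python
import math


def class_weights(class_counts, epoch, total_epochs):
    """Per-class weights under an assumed (hypothetical) linear schedule.

    At the first epoch every class has weight 1; by the last epoch the
    weights are proportional to 1 / n_c, so tail classes count more.
    """
    # alpha ramps from 0 (first epoch) to 1 (last epoch).
    alpha = epoch / max(total_epochs - 1, 1)
    raw = [(1.0 / n) ** alpha for n in class_counts]
    # Normalise so that the weights average to 1 across classes.
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


def weighted_cross_entropy(logits, label, weights):
    """Cross-entropy of one sample, scaled by the weight of its true class."""
    # Numerically stable log-sum-exp over the logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weights[label] * (log_sum_exp - logits[label])


if __name__ == "__main__":
    counts = [1000, 100, 10]  # illustrative head-to-tail class sizes
    for epoch in (0, 9):
        print(epoch, class_weights(counts, epoch, total_epochs=10))
```

Under this assumed schedule, early epochs treat all types equally, and later epochs shift the loss toward tail types. Other schedules and weighting functions are equally plausible.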