Feature-Enhanced Document-Level Relation Extraction in Threat Intelligence with Knowledge Distillation

Cited by: 3
Authors
Li, Yongfei [1]
Guo, Yuanbo [1]
Fang, Chen [1]
Hu, Yongjin [1]
Liu, Yingze [1]
Chen, Qingli [1]
Affiliations
[1] PLA Informat Engn Univ, Sch Cryptog Engn, Zhengzhou 450001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
threat intelligence; document-level relation extraction; knowledge distillation; knowledge graph
DOI
10.3390/electronics11223715
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Relation extraction in the threat intelligence domain plays an important role in mining the internal associations between crucial threat elements and in constructing a knowledge graph (KG). This study designed a novel document-level relation extraction model, FEDRE-KD, that integrates additional features to take full advantage of the information in documents. The study also introduced a teacher-student model to realize knowledge distillation and further improve performance. Additionally, a threat intelligence ontology was constructed to standardize the entities and their relationships. To address the lack of publicly available datasets for threat intelligence, documents collected from social blogs, vendor bulletins, and hacking forums were manually annotated. After training the model, we constructed a threat intelligence knowledge graph in Neo4j. Experimental results indicate the effectiveness of the additional features and of knowledge distillation. Compared with the mainstream models SSAN, GAIN, and ATLOP, FEDRE-KD improved the F1 score by 22.07, 20.06, and 22.38, respectively.
Pages: 13