CyNER: Information Extraction from Unstructured Text of CTI Sources with Noncontextual IOCs

Cited by: 8
Authors
Fujii, Shota [1, 2]
Kawaguchi, Nobutaka [1 ]
Shigemoto, Tomohiro [1 ]
Yamauchi, Toshihiro [3 ]
Affiliations
[1] Hitachi Ltd, Res & Dev Grp, Yokohama, Kanagawa, Japan
[2] Okayama Univ, Grad Sch Nat Sci & Technol, Okayama, Japan
[3] Okayama Univ, Fac Nat Sci & Technol, Okayama, Japan
Source
ADVANCES IN INFORMATION AND COMPUTER SECURITY, IWSEC 2022 | 2022 / Vol. 13504
Keywords
Cyber Threat Intelligence; Information Extraction; Named Entity Recognition; Relation Extraction; STIX
DOI
10.1007/978-3-031-15255-9_5
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Cybersecurity threats have been increasing and growing more sophisticated year by year. In such circumstances, gathering Cyber Threat Intelligence (CTI) and keeping up with current threat information is crucial. Structured CTI such as Structured Threat Information eXpression (STIX) is particularly useful because it can automate security operations such as updating FW/IDS rules and analyzing attack trends. However, because most CTI is written in natural language, manual analysis with domain knowledge is required, which is time-consuming. In this work, we propose CyNER, a method for automatically structuring CTI and converting it into STIX format. CyNER extracts named entities in the context of CTI and then extracts the relations between named entities and IOCs in order to convert them into STIX. In addition, by using key phrase extraction, CyNER can extract relations between named entities and IOCs that lack contextual information, such as those listed at the bottom of a CTI report. We describe our design and implementation of CyNER and demonstrate that it can extract named entities with an F-measure of 0.80 and extract relations between named entities and IOCs with a maximum accuracy of 81.6%. Our analysis of structured CTI showed that CyNER can extract IOCs that are not listed on existing reputation sites, and that it can automatically extract IOCs that have been exploited for a long time and across multiple attack groups. CyNER is thus expected to contribute to the efficiency of CTI analysis.
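As a rough illustration of the pipeline the abstract describes (find IOCs, relate them to a named entity, emit STIX), the following is a minimal sketch and not CyNER's implementation. It assumes simple regular expressions in place of the paper's extraction step and takes the malware name as a hand-supplied argument, where CyNER would obtain it through named entity recognition. The names `IOC_PATTERNS`, `extract_iocs`, `to_stix_pattern`, and `build_objects` are hypothetical, and the dictionaries are abbreviated STIX 2.1-style objects that omit required properties such as timestamps.

```python
import re
import uuid

# Hypothetical, simplified patterns. Real CTI text often defangs IOCs
# (for example 198.51.100[.]7), which these patterns do not handle.
IOC_PATTERNS = {
    "ipv4-addr": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}


def extract_iocs(text):
    """Return (kind, value) pairs for every IOC-like string in the text."""
    found = []
    for kind, pattern in IOC_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((kind, match.group(0)))
    return found


def to_stix_pattern(kind, value):
    """Render one IOC as a STIX pattern string."""
    if kind == "ipv4-addr":
        return f"[ipv4-addr:value = '{value}']"
    if kind == "sha256":
        return f"[file:hashes.'SHA-256' = '{value}']"
    raise ValueError(f"unsupported IOC kind: {kind}")


def build_objects(text, malware_name):
    """Link every IOC in the text to one malware entity.

    Assumption: malware_name is given by the caller; the objects below
    are abbreviated and would not pass full STIX 2.1 validation.
    """
    malware_id = f"malware--{uuid.uuid4()}"
    objects = [{"type": "malware", "id": malware_id, "name": malware_name}]
    for kind, value in extract_iocs(text):
        indicator_id = f"indicator--{uuid.uuid4()}"
        objects.append({
            "type": "indicator",
            "id": indicator_id,
            "pattern": to_stix_pattern(kind, value),
            "pattern_type": "stix",
        })
        objects.append({
            "type": "relationship",
            "id": f"relationship--{uuid.uuid4()}",
            "relationship_type": "indicates",
            "source_ref": indicator_id,
            "target_ref": malware_id,
        })
    return objects
```

Calling `build_objects("The sample beaconed to 198.51.100.7.", "ExampleRAT")` yields one malware object, one indicator, and one `indicates` relationship joining them. The sketch links every IOC to a single entity, which sidesteps the harder problem the abstract highlights: deciding which entity a context-free IOC listed at the bottom of a report belongs to.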
Pages: 85-104
Page count: 20