CyNER: Information Extraction from Unstructured Text of CTI Sources with Noncontextual IOCs

Cited by: 8

Authors:

Fujii, Shota [1,2]

Kawaguchi, Nobutaka [1]

Shigemoto, Tomohiro [1]

Yamauchi, Toshihiro [3]

Affiliations:

[1] Hitachi Ltd, Res & Dev Grp, Yokohama, Kanagawa, Japan

[2] Okayama Univ, Grad Sch Nat Sci & Technol, Okayama, Japan

[3] Okayama Univ, Fac Nat Sci & Technol, Okayama, Japan

Source:

ADVANCES IN INFORMATION AND COMPUTER SECURITY, IWSEC 2022 | 2022, Vol. 13504

Keywords:

Cyber Threat Intelligence; Information Extraction; Named Entity Recognition; Relation Extraction; STIX;

DOI:

10.1007/978-3-031-15255-9_5

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202; 0835;

Abstract:

Cybersecurity threats have been increasing and growing more sophisticated year by year. In such circumstances, gathering Cyber Threat Intelligence (CTI) and keeping up with up-to-date threat information is crucial. Structured CTI such as Structured Threat Information eXpression (STIX) is particularly useful because it can automate security operations such as updating FW/IDS rules and analyzing attack trends. However, as most CTIs are written in natural language, manual analysis with domain knowledge is required, which is quite time-consuming. In this work, we propose CyNER, a method for automatically structuring CTIs and converting them into STIX format. CyNER extracts named entities in the context of CTI and then extracts the relations between named entities and IOCs in order to convert them into STIX. In addition, by using key phrase extraction, CyNER can extract relations between named entities and IOCs that lack contextual information, such as those listed at the bottom of a CTI. We describe our design and implementation of CyNER and demonstrate that it can extract named entities with an F-measure of 0.80 and extract relations between named entities and IOCs with a maximum accuracy of 81.6%. Our analysis of structured CTI showed that CyNER can extract IOCs that are not included in existing reputation sites, and that it can automatically extract IOCs that have been exploited for a long time and across multiple attack groups. CyNER is thus expected to contribute to the efficiency of CTI analysis.
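
To make the end product of such a pipeline concrete, the following is a minimal sketch of turning IOCs found in free text into STIX-style objects. It is not CyNER's method: the abstract describes learned named entity recognition and relation extraction, whereas this sketch uses two illustrative regular expressions and takes the malware name as a hand-supplied argument. The function names (`extract_iocs`, `to_stix_pattern`, `build_bundle`) and the sample report text are hypothetical, and the emitted objects are simplified (for example, required STIX 2.1 timestamps such as `created`, `modified`, and `valid_from` are omitted).

```python
import re
import uuid

# Illustrative patterns only (an assumption of this sketch); real CTI reports
# contain many more IOC types and defanged forms such as 198[.]51[.]100[.]7.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rf"\b{OCTET}(?:\.{OCTET}){{3}}\b")
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")


def extract_iocs(text):
    """Return (kind, value) pairs: all IPv4 addresses first, then all MD5 hashes."""
    iocs = [("ipv4-addr", m.group(0)) for m in IPV4_RE.finditer(text)]
    iocs += [("md5", m.group(0)) for m in MD5_RE.finditer(text)]
    return iocs


def to_stix_pattern(kind, value):
    """Render one IOC as a STIX patterning expression."""
    if kind == "ipv4-addr":
        return f"[ipv4-addr:value = '{value}']"
    if kind == "md5":
        return f"[file:hashes.MD5 = '{value}']"
    raise ValueError(f"unsupported IOC kind: {kind}")


def build_bundle(text, malware_name):
    """Link every IOC in text to one malware object via 'indicates' relationships.

    malware_name is supplied by hand here; a real system would obtain it
    from named entity recognition over the report.
    """
    malware = {
        "type": "malware",
        "spec_version": "2.1",
        "id": f"malware--{uuid.uuid4()}",
        "name": malware_name,
        "is_family": False,  # assumption made for this sketch
    }
    objects = [malware]
    for kind, value in extract_iocs(text):
        indicator = {
            "type": "indicator",
            "spec_version": "2.1",
            "id": f"indicator--{uuid.uuid4()}",
            "pattern": to_stix_pattern(kind, value),
            "pattern_type": "stix",
        }
        relationship = {
            "type": "relationship",
            "spec_version": "2.1",
            "id": f"relationship--{uuid.uuid4()}",
            "relationship_type": "indicates",
            "source_ref": indicator["id"],
            "target_ref": malware["id"],
        }
        objects += [indicator, relationship]
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": objects}


if __name__ == "__main__":
    report = (
        "The dropper contacts 198.51.100.7 and ships a payload with MD5 "
        "d41d8cd98f00b204e9800998ecf8427e."
    )
    for obj in build_bundle(report, "ExampleRAT")["objects"]:
        print(obj["type"], obj.get("pattern", obj.get("name", "")))
```

The hard part that the abstract addresses, deciding which named entity an IOC belongs to when the IOC appears without surrounding context, is exactly what this sketch sidesteps by attaching every IOC to a single supplied name.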


Pages: 85-104

Page count: 20
