Named Entity Recognition in Aviation Products Domain Based on BERT

Cited by: 0
Authors
Yang, Mingye [1 ]
Namoano, Bernadin [1 ]
Farsi, Maryam [1 ]
Erkoyuncu, John Ahmet [1 ]
Affiliations
[1] Cranfield Univ, Ctr Digital Engn & Mfg, Cranfield MK43 0AL, England
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Hidden Markov models; Data models; Named entity recognition; Knowledge graphs; Atmospheric modeling; Feature extraction; Data mining; Ontologies; Encoding; Biological system modeling; Aviation; named entity recognition (NER); knowledge graph; bidirectional encoder representations from transformers (BERT); bidirectional long short-term memory network (Bi-LSTM);
DOI
10.1109/ACCESS.2024.3516390
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The aviation product manufacturing industry is undergoing a profound transformation towards intelligent operation, and constructing a knowledge graph specifically for the aviation field has become a core step in achieving cognitive intelligence. Within knowledge graph construction, named entity recognition (NER) is a key step and one of the main tasks of knowledge extraction. Because aviation product text is highly specialised and its contextual information spans long ranges, existing models often perform poorly at entity extraction. This paper proposes an NER method tailored to the aviation product field (BBC-Ap) that combines a domain-specific ontology with deep learning algorithms to improve the accuracy and efficiency of entity extraction from complex technical documents. The method first establishes an ontology model of aviation products and annotates the relevant text data to form a dataset for training the NER model. It then adopts a multi-level model structure based on BERT: BERT generates word vector representations, a bidirectional long short-term memory network (BiLSTM) serves as an encoder to extract semantic features, and a conditional random field (CRF) serves as a decoder to achieve optimal label assignment. In experiments on the constructed aviation product dataset, the model achieved a Precision of 91.74%, a Recall of 92.46%, and an F1 score of 92.1%. Compared with other baseline models, the F1 score improved by 0.9% to 1.5%. The model also performs well on standard datasets such as CoNLLpp, with a Precision of 92.87%, a Recall of 92.54%, and an F1 score of 92.70%. Finally, the model was used to construct a knowledge graph in Neo4j reflecting the relationships between aviation products, further demonstrating the effectiveness and practicality of the method.
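The abstract describes the CRF decoder as achieving "optimal label assignment". As a rough illustration of what that usually means, the sketch below runs Viterbi decoding over per-token label scores. It is not the authors' implementation: the label set (`O`, `B-PART`, `I-PART`), the example tokens, and all scores are hypothetical, and the emission scores merely stand in for what a BERT + BiLSTM encoder would produce.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][y]   : score of label y at token t (stand-in for encoder output)
    transitions[p][y] : score of moving from label p to label y
    """
    if not emissions:
        return []
    # best[t][y] = best total score of any path that ends in label y at token t
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y] = previous label on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label to recover the path.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical label set and scores, for illustration only.
LABELS = ["O", "B-PART", "I-PART"]
TRANSITIONS = {
    "O": {"O": 0.5, "B-PART": 0.0, "I-PART": -10.0},  # O -> I-PART is penalised
    "B-PART": {"O": 0.0, "B-PART": -1.0, "I-PART": 1.0},
    "I-PART": {"O": 0.0, "B-PART": -1.0, "I-PART": 0.5},
}
# Made-up scores for the tokens "landing", "gear", "failed".
EMISSIONS = [
    {"O": 1.0, "B-PART": 0.9, "I-PART": 0.2},
    {"O": 0.3, "B-PART": 0.1, "I-PART": 1.5},
    {"O": 2.0, "B-PART": 0.1, "I-PART": 0.3},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, LABELS))
# → ['B-PART', 'I-PART', 'O']
```

Picking the top-scoring label for each token independently would give `O, I-PART, O` here, which is an invalid tag sequence. Scoring whole sequences with transition terms lets the decoder prefer `B-PART, I-PART, O` instead, which is the benefit a CRF layer is generally expected to add on top of the encoder.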
Pages: 189710-189721
Number of pages: 12