Named Entity Recognition of Fresh Egg Supply Chain Based on BERT-CRF Architecture

Cited by: 0
Authors
Liu X. [1, 2]
Zhang M. [2]
Gu Q. [2]
Ren Y. [2]
He D. [1, 3]
Gao W. [1, 3]
Affiliations
[1] College of Information and Electrical Engineering, China Agricultural University, Beijing
[2] National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing
[3] Key Laboratory of Agricultural Informatization Standardization, Ministry of Agriculture and Rural Affairs, Beijing
Source
Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery | 2021 / Vol. 52
Keywords
Conditional random field; Fresh egg supply chain; Named entity recognition; Pre-training model
DOI
10.6041/j.issn.1000-1298.2021.S0.066
Abstract
Recognizing named entities in raw text is the first step in constructing a knowledge graph of the fresh egg supply chain, and it supports a variety of downstream natural language processing tasks. The task organizes the information in the supply chain and provides a basis for food safety traceability. Raw text about the fresh egg supply chain contains many types of entities, and extracting feature information from it is inefficient. To identify named entities of pre-defined types quickly and accurately, a bidirectional encoder representations from transformers-conditional random field (BERT-CRF) architecture was proposed for named entity recognition (NER) in the fresh egg supply chain domain. In the BERT-CRF architecture, the begin, internal and other (BIO) labeling rule was used to label sequences, and the concatenation of the character vector and position vector was used as input. The pre-trained language model (BERT) was used to obtain the global features of the input sequence, and a CRF layer was added at the end of the model to introduce hard constraints. A comparative experiment against three other NER models was conducted on a self-constructed dataset containing five categories and 21 subcategories. The results showed that the BERT-CRF model outperformed the other models and achieved state-of-the-art performance, with precision, recall and F1-score of 91.82%, 90.44% and 91.01%, respectively. Finally, comparative experiments on another self-constructed dataset (a dish dataset) showed that the model had a certain generalization ability. © 2021, Chinese Society of Agricultural Machinery. All rights reserved.
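The BIO labeling rule and the kind of hard constraint a CRF layer can enforce on tag sequences may be easier to picture with a small example. The following is a minimal sketch and not the authors' implementation: the function names, the sample text, and the entity types PROD and LOC are all hypothetical, and a trained CRF learns transition scores rather than applying a fixed boolean rule as done here.

```python
from typing import List, Tuple


def bio_tags(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one BIO tag per character.

    Each span is (start, end_exclusive, entity_type). The entity types
    used below are hypothetical placeholders, not the paper's categories.
    """
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining characters of the entity
    return tags


def is_valid_transition(prev: str, curr: str) -> bool:
    """An I-X tag may only follow B-X or I-X of the same type."""
    if curr.startswith("I-"):
        etype = curr[2:]
        return prev in (f"B-{etype}", f"I-{etype}")
    return True


def is_valid_sequence(tags: List[str]) -> bool:
    """Check a whole tag sequence, treating the start of the sequence as 'O'."""
    prev = "O"
    for tag in tags:
        if not is_valid_transition(prev, tag):
            return False
        prev = tag
    return True


# Hypothetical example: "eggs" is a product, "Beijing" is a location.
sample_text = "eggs from Beijing"
sample_tags = bio_tags(sample_text, [(0, 4, "PROD"), (10, 17, "LOC")])
for ch, tag in zip(sample_text, sample_tags):
    print(repr(ch), tag)
print(is_valid_sequence(sample_tags))
```

A sequence such as O followed directly by I-LOC is rejected by `is_valid_sequence`. This is the sort of ill-formed output that a per-character classifier can emit on its own and that a CRF layer is meant to rule out.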
Pages: 519-525
Page count: 6