Entity relationship extraction and correlation analysis of agricultural product standard domain knowledge graph

Cited by: 0
Authors
Lyu D. [1, 2]
Chen J. [3]
Mao D. [1, 2]
Zhang Q. [1]
Zhao M. [1, 2]
Hao Z. [1, 4]
Affiliations
[1] National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing
[2] Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing Technology and Business University, Beijing
[3] Institute of Standardization Theory and Strategy, China National Institute of Standardization, Beijing
[4] State Key Laboratory of Internet of Things for Smart City, Faculty of Science and Technology, University of Macau
Source
Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering | 2022 / Vol. 38 / No. 09
Keywords
agricultural product standard; community mining; dependency parsing; knowledge graph; relation extraction
DOI
10.11975/j.issn.1002-6819.2022.09.035
Abstract
Agricultural product standards have been used in recent years to support agricultural product safety and supervision. However, the terms used across these standards remain scattered and isolated from one another, with no systematic correlation or reuse. A knowledge graph can connect many types of information into a network, which allows analysis from a "relational" perspective. In this study, ontology rules for agricultural standard information were designed from the drafting specifications for standardization documents and from relevant Baidu Encyclopedia entries. For semi-structured data, a suitable regular wrapper was designed; it extracted standard document information with accuracy and F1 both above 95%. For unstructured data, an open relation extraction model for the agricultural products field (OREM-AF) was established using dependency parsing. The model first learns the dependency structure between entity pairs from the triple labels of the training corpus, and then generates logical expressions that serve as entity relation extraction paradigms. After the whole training corpus has been learned, the test corpus is parsed to obtain its core vocabulary chain. The subtree rooted at each core word is then matched against the learned set of entity relation dependency structure paradigms to obtain the corresponding entity pairs and relations, and thus the corresponding triple. In this way, triples of information related to agricultural products were extracted automatically. The experimental results show that OREM-AF reached 74.22% accuracy and a 75.12% F1 value on the agricultural product data set, and 84.51% accuracy and a 75.43% F1 value on the common data set.
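The learn-then-match idea described above can be illustrated with a minimal Python sketch. It is not the OREM-AF implementation: the `Token` type, the functions `learn_pattern` and `extract`, the toy English sentences, and the Universal Dependencies style labels (`nsubj`, `obj`) are all assumptions made for illustration, and the dependency parses are written by hand rather than produced by a parser. A pattern here is simply the pair of dependency labels that link the subject and object entities to the relation word.

```python
from typing import List, NamedTuple, Optional, Set, Tuple

class Token(NamedTuple):
    text: str
    head: int    # index of the head token, -1 for the sentence root
    deprel: str  # dependency label linking this token to its head

Pattern = Tuple[str, str]          # (subject label, object label)
Triple = Tuple[str, str, str]      # (subject, relation, object)

def learn_pattern(sentence: List[Token], triple: Triple) -> Optional[Pattern]:
    """Learn one pattern from a labeled triple, or return None when the two
    entities do not attach directly to the relation word."""
    subj, rel, obj = triple
    index = {tok.text: i for i, tok in enumerate(sentence)}
    s, r, o = index[subj], index[rel], index[obj]
    if sentence[s].head == r and sentence[o].head == r:
        return (sentence[s].deprel, sentence[o].deprel)
    return None

def extract(sentence: List[Token], patterns: Set[Pattern]) -> List[Triple]:
    """Treat every token as a candidate relation word and emit a triple when
    its direct children match a learned (subject label, object label) pair."""
    triples: List[Triple] = []
    for r, rel_tok in enumerate(sentence):
        children = [t for t in sentence if t.head == r]
        for subj_rel, obj_rel in sorted(patterns):
            subjects = [t.text for t in children if t.deprel == subj_rel]
            objects = [t.text for t in children if t.deprel == obj_rel]
            triples.extend((s, rel_tok.text, o) for s in subjects for o in objects)
    return triples

# Hypothetical training sentence: "Standard-A specifies pesticide limits"
train_sentence = [
    Token("Standard-A", 1, "nsubj"),
    Token("specifies", -1, "root"),
    Token("pesticide", 3, "compound"),
    Token("limits", 1, "obj"),
]
learned = learn_pattern(train_sentence, ("Standard-A", "specifies", "limits"))
patterns: Set[Pattern] = {learned} if learned is not None else set()

# Hypothetical test sentence: "Standard-B regulates contaminants"
test_sentence = [
    Token("Standard-B", 1, "nsubj"),
    Token("regulates", -1, "root"),
    Token("contaminants", 1, "obj"),
]
triples = extract(test_sentence, patterns)
print(triples)
```

The sketch only covers the simplest case, where both entities are direct children of the relation word; longer dependency paths and the sibling substitution mentioned in the abstract are left out.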
Compared with the other models, extraction based on dependency parsing performed better, owing to active learning and fine-grained sibling substitution. This suggests that the active learning capability accounts for the model's strong transferability. A knowledge graph for the field of agricultural standards was then constructed and stored in the Neo4j graph database; it captures the links to the information being retrieved clearly and quickly, providing supplementary analytical support for the regulation of agricultural products. Community mining was carried out on the network of agricultural standards using the Leiden algorithm. The GB 2762 and GB 2763 standards fell in the same community and both belong to the National Food Safety Standard, indicating that the agricultural field attaches great importance to pesticide and contaminant residues in agricultural products. Most GB 5009 series standards belonged to a single community and are essentially hygiene inspection methods for physical and chemical indicators of agricultural products; the most frequently referenced indicators were total mercury and organic mercury, total arsenic and inorganic arsenic, total lead, and organophosphorus pesticide residues. Most citations of GB 14881 came from local standards, indicating that local standards are prepared with reference to its guidelines on raw material purchase, processing, packaging, and storage in the production of agricultural products. © 2022 Chinese Society of Agricultural Engineering. All rights reserved.
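To make the community mining step more concrete, the following Python sketch scores a partition of a small citation network. It does not implement the Leiden algorithm; it only computes modularity, a quality function that Leiden-style methods commonly optimize, so that a good and a poor partition can be compared. The network, the node names `S1` to `S6`, and the function `modularity` are hypothetical and are not data from the study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def modularity(edges: List[Tuple[str, str]], community: Dict[str, int]) -> float:
    """Modularity of a partition of an undirected, unweighted graph:
    Q = sum over communities c of (L_c / m) - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    Assumes no self-loops, no duplicate edges, and at least one edge."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical citation links: two tightly linked groups joined by one edge.
edges = [
    ("S1", "S2"), ("S2", "S3"), ("S1", "S3"),
    ("S4", "S5"), ("S5", "S6"), ("S4", "S6"),
    ("S3", "S4"),
]

# Candidate partition that separates the two groups.
split = {"S1": 0, "S2": 0, "S3": 0, "S4": 1, "S5": 1, "S6": 1}
# Trivial partition that puts every standard in one community.
merged = {node: 0 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

The partition that separates the two densely linked groups scores higher than the trivial one, which is the kind of structure a community detection method looks for in a standards citation network.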
Pages: 315-323
Number of pages: 8