Deep Learning-Based Named Entity Recognition and Knowledge Graph Construction for Geological Hazards

Cited by: 66
Authors
Fan, Runyu [1, 2]
Wang, Lizhe [1, 2]
Yan, Jining [1, 2]
Song, Weijing [1, 2]
Zhu, Yingqian [1, 2]
Chen, Xiaodao [1, 2]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
named entity recognition; knowledge graph; deep learning; geological hazards; neural networks
DOI
10.3390/ijgi9010015
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Constructing a knowledge graph from the geological hazards literature can facilitate reuse of that literature and provide a reference for geological hazard governance. Named entity recognition (NER), a core technology for constructing a geological hazard knowledge graph, faces the challenge that named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context, which makes it difficult to design practical features for NER classification. To address this problem, this paper proposes a deep learning-based NER model, the deep, multi-branch BiGRU-CRF model, which combines a multi-branch bidirectional gated recurrent unit (BiGRU) layer with a conditional random field (CRF) model. In an end-to-end, supervised process, the proposed model automatically learns and transforms features through the multi-branch BiGRU layer and enhances the output with a CRF layer. In addition to the model, we also propose a pattern-based corpus construction method to build the corpus the model requires. Experimental results indicate that the proposed model outperformed state-of-the-art models. Using the proposed model, we constructed a large-scale geological hazard literature knowledge graph containing 34,457 entity nodes and 84,561 relations.
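The abstract does not spell out how the CRF layer refines the BiGRU output, so the following is only a minimal sketch of the general idea, assuming a standard linear-chain CRF decoded with the Viterbi algorithm. The tag set (`O`, `B-HAZ`, `I-HAZ`), the emission scores, the transition penalty, and the function names `viterbi_decode` and `greedy_decode` are all hypothetical and chosen for illustration; they are not taken from the paper. In a real model the emission scores would come from the recurrent layer and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

Tag = str


def greedy_decode(emissions: List[Dict[Tag, float]]) -> List[Tag]:
    """Pick the best tag per token independently (no CRF layer)."""
    return [max(e, key=e.get) for e in emissions]


def viterbi_decode(
    emissions: List[Dict[Tag, float]],
    transitions: Dict[Tuple[Tag, Tag], float],
    tags: List[Tag],
) -> List[Tag]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][tag] is the per-token score (hypothetical values here);
    transitions[(prev, cur)] is the score of moving from prev to cur,
    defaulting to 0.0 when a pair is absent.
    """
    if not emissions:
        return []
    # score[t][j]: best score of any path that ends in tag j at position t
    score = [dict(emissions[0])]
    back: List[Dict[Tag, Tag]] = []
    for t in range(1, len(emissions)):
        cur: Dict[Tag, float] = {}
        ptr: Dict[Tag, Tag] = {}
        for j in tags:
            best_prev = max(
                tags, key=lambda i: score[-1][i] + transitions.get((i, j), 0.0)
            )
            cur[j] = (
                score[-1][best_prev]
                + transitions.get((best_prev, j), 0.0)
                + emissions[t][j]
            )
            ptr[j] = best_prev
        score.append(cur)
        back.append(ptr)
    # Follow back-pointers from the best final tag
    path = [max(tags, key=lambda j: score[-1][j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for the tokens: "the", "debris", "flow", "occurred"
TAGS = ["O", "B-HAZ", "I-HAZ"]
EMISSIONS = [
    {"O": 2.0, "B-HAZ": 0.0, "I-HAZ": 0.0},
    {"O": 0.5, "B-HAZ": 1.0, "I-HAZ": 1.2},
    {"O": 0.2, "B-HAZ": 0.1, "I-HAZ": 1.5},
    {"O": 2.0, "B-HAZ": 0.0, "I-HAZ": 0.0},
]
# Assumed penalty: an entity cannot continue (I-) directly after O
TRANSITIONS = {("O", "I-HAZ"): -10.0}

greedy_path = greedy_decode(EMISSIONS)
crf_path = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
```

With these made-up scores, per-token decoding yields `O, I-HAZ, I-HAZ, O`, which starts an entity with an inside tag, while the Viterbi path is `O, B-HAZ, I-HAZ, O`. This illustrates one way a CRF layer can correct locally plausible but globally inconsistent label sequences.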
Pages: 22