A pattern-aware self-attention network for distant supervised relation extraction

Cited by: 30
Authors
Shang, Yu-Ming [1 ]
Huang, Heyan [1 ,2 ]
Sun, Xin [1 ]
Wei, Wei [3 ]
Mao, Xian-Ling [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing, Peoples R China
[2] Beijing Engn Res Ctr High Volume Language Informa, Beijing, Peoples R China
[3] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
Keywords
distant supervision; relation extraction; pre-trained Transformer; relational pattern; self-attention network;
DOI
10.1016/j.ins.2021.10.047
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Distant supervised relation extraction is an efficient strategy for finding relational facts in unstructured text without labeled training data. A recent paradigm for building relation extractors is to use pre-trained Transformer language models to produce high-quality sentence representations. However, because the original Transformer is weak at capturing local dependencies and phrasal structures, existing Transformer-based methods cannot identify the various relational patterns in sentences. To address this issue, we propose a novel distant supervised relation extraction model that employs a specifically designed pattern-aware self-attention network to automatically discover relational patterns for pre-trained Transformers in an end-to-end manner. Specifically, the proposed method assumes that the correlation between two adjacent tokens reflects the probability that they belong to the same pattern. Based on this assumption, a novel self-attention network is designed to generate the probability distribution of all patterns in a sentence. This probability distribution is then applied as a constraint in the first Transformer layer to encourage its attention heads to follow the relational pattern structures. As a result, fine-grained pattern information is enhanced in the pre-trained Transformer without losing global dependencies. Extensive experiments on two popular benchmark datasets demonstrate that our model outperforms state-of-the-art baselines. (c) 2021 Elsevier Inc. All rights reserved.
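The record does not reproduce the paper's equations, so the following is only a minimal, hypothetical PyTorch sketch of how such a pattern constraint could be realized: adjacent-token correlations are scored with a bilinear form, chained into a same-pattern probability for every token pair, and converted into an additive bias on the first layer's attention logits. The parameter w_corr, the function name, and the additive-bias formulation are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn.functional as F

def pattern_aware_attention_bias(hidden, w_corr):
    """Hypothetical sketch of a pattern-aware attention constraint.

    hidden: (batch, seq_len, d_model) token representations
    w_corr: (d_model, d_model) assumed bilinear correlation weights
    Returns an additive bias of shape (batch, seq_len, seq_len)
    for the first Transformer layer's attention logits.
    """
    # Abstract's assumption: the correlation between two adjacent
    # tokens reflects the probability that they share a pattern.
    left, right = hidden[:, :-1, :], hidden[:, 1:, :]
    adj_corr = torch.sigmoid(
        torch.einsum("bld,de,ble->bl", left, w_corr, right)
    )  # (batch, seq_len - 1): p(token i and i+1 in same pattern)

    # Chain adjacent links: the probability that tokens i and j lie
    # in one contiguous pattern is the product of the link
    # probabilities between them (computed in log space).
    log_corr = torch.log(adj_corr + 1e-9)
    cum = F.pad(torch.cumsum(log_corr, dim=1), (1, 0))  # (batch, seq_len)
    same_pattern = torch.exp(-(cum.unsqueeze(2) - cum.unsqueeze(1)).abs())

    # Log-probability as an additive bias, so first-layer heads are
    # nudged toward within-pattern tokens without hard masking.
    return torch.log(same_pattern + 1e-9)
```

Under these assumptions the bias would be broadcast over heads and added to the raw attention scores of the first layer only, e.g. `scores = q @ k.transpose(-2, -1) / d_head**0.5 + bias.unsqueeze(1)`, leaving the remaining layers free to model global dependencies.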
Pages: 269-279 (11 pages)