Reinforcement of BERT with Dependency-Parsing Based Attention Mask

Cited: 0
Authors
Mechouma, Toufik [1 ]
Biskri, Ismail [2 ]
Meunier, Jean Guy [1 ]
Affiliations
[1] Univ Quebec Montreal, Montreal, PQ, Canada
[2] Univ Quebec Trois Rivieres, Trois Rivieres, PQ, Canada
Source
ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022 | 2022 / Vol. 1653
Keywords
Bert; Transformers; Attention mechanisms; Dependency parsing;
DOI
10.1007/978-3-031-16210-7_9
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The dot-product attention mechanism is among the most recent attention mechanisms and has shown outstanding performance in BERT. In this paper, we propose a dependency-parsing mask that reinforces the padding mask in the multi-head attention units. The padding mask is already used to filter out padding positions; the proposed mask aims to improve BERT's attention filtering. The conducted experiments show that BERT performs better with the proposed mask.
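The idea described in the abstract — restricting attention to dependency-linked tokens on top of the usual padding mask — can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the symmetric treatment of dependency arcs, and the self-attention fallback for every position are assumptions made here for illustration.

```python
import numpy as np

def combined_attention_mask(dep_edges, seq_len, pad_len):
    """Combine a dependency-parse mask with a padding mask (illustrative sketch).

    dep_edges: (head, dependent) index pairs from a dependency parser
    seq_len:   number of real tokens; pad_len: total padded length
    Returns a (pad_len, pad_len) additive mask of 0.0 (keep) / -inf (block),
    to be added to the attention scores before the softmax.
    """
    # Every position may attend to itself (avoids fully masked rows).
    allowed = np.eye(pad_len, dtype=bool)
    # Tokens linked by a dependency arc may attend to each other
    # (symmetric arcs are an assumption of this sketch).
    for head, dep in dep_edges:
        allowed[head, dep] = True
        allowed[dep, head] = True
    # Padding mask: real tokens never attend to padding positions.
    allowed[:seq_len, seq_len:] = False
    return np.where(allowed, 0.0, -np.inf)

def masked_attention(scores, mask):
    """Apply the additive mask, then a row-wise softmax over the scores."""
    s = scores + mask
    s = s - s.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(s)
    return e / e.sum(axis=-1, keepdims=True)
```

For a 3-token sentence padded to length 4 with arcs (0, 1) and (1, 2), token 0 ends up attending only to itself and token 1, while the padding position attends only to itself; each attention row still sums to 1.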
Pages: 112-122
Page count: 11