HBert: A Long Text Processing Method Based on BERT and Hierarchical Attention Mechanisms

Cited: 5
Authors
Lv, Xueqiang [1 ]
Liu, Zhaonan [1 ]
Zhao, Ying [1 ]
Xu, Ge [2 ]
You, Xindong [1 ]
Affiliations
[1] Beijing Information Science & Technology University, Beijing, People's Republic of China
[2] Minjiang University, Fuzhou, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
BERT; Hierarchical Attention; Long Text Processing;
DOI
10.4018/IJSWIS.322769
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
With the emergence of large-scale pre-trained models based on the transformer architecture, performance on natural language processing tasks has been pushed to a new level. However, due to the quadratic complexity of the transformer's self-attention mechanism, these models handle long text poorly. To address this problem, a long text processing method named HBert, based on BERT and a hierarchical attention neural network, is proposed. First, the long text is split into sentences, whose vectors are obtained through a word encoder composed of BERT and a word attention layer. An article vector is then obtained through a sentence encoder composed of a transformer and sentence attention, and this article vector is used to complete the downstream tasks. Experimental results show that the proposed HBert method achieves good results on text classification and question answering (QA) tasks, with an F1 score of 95.7% on longer text classification and 75.2% on QA, outperforming the state-of-the-art Longformer model.
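The abstract describes the pipeline only in prose, so the PyTorch sketch below shows one plausible realization of the two-level hierarchy: a BERT word encoder with additive word attention producing sentence vectors, then a transformer sentence encoder with sentence attention producing the article vector. Class names, layer and head counts, and the additive attention pooling are assumptions for illustration, not the authors' released implementation.

    # Minimal sketch of the hierarchy described in the abstract.
    # Hyperparameters and class names are assumptions, not the paper's code.
    import torch
    import torch.nn as nn
    from transformers import BertModel

    class AttentionPool(nn.Module):
        """Additive attention pooling over a sequence of vectors."""
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim)
            self.context = nn.Linear(dim, 1, bias=False)

        def forward(self, x):                               # x: (batch, seq, dim)
            scores = self.context(torch.tanh(self.proj(x))) # (batch, seq, 1)
            weights = torch.softmax(scores, dim=1)
            return (weights * x).sum(dim=1)                 # (batch, dim)

    class HBertSketch(nn.Module):
        def __init__(self, num_classes, bert_name="bert-base-uncased"):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)  # word encoder
            dim = self.bert.config.hidden_size
            self.word_attention = AttentionPool(dim)          # -> sentence vector
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                               batch_first=True)
            self.sentence_encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.sentence_attention = AttentionPool(dim)      # -> article vector
            self.classifier = nn.Linear(dim, num_classes)

        def forward(self, input_ids, attention_mask):
            # input_ids: (batch, num_sents, sent_len) -- one row per sentence
            b, s, l = input_ids.shape
            token_states = self.bert(
                input_ids=input_ids.view(b * s, l),
                attention_mask=attention_mask.view(b * s, l),
            ).last_hidden_state                               # (b*s, l, dim)
            sent_vecs = self.word_attention(token_states).view(b, s, -1)
            sent_states = self.sentence_encoder(sent_vecs)    # (b, s, dim)
            article_vec = self.sentence_attention(sent_states)
            return self.classifier(article_vec)

Because each sentence is encoded by BERT independently, self-attention cost grows with sentence length rather than document length, which is the efficiency argument the abstract makes against applying a flat transformer to long text.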
Pages: 14
References
16 records in total
[1] Abreu J, Fred L, Macedo D, Zanchettin C. Hierarchical Attentional Hybrid Neural Networks for Document Classification. Artificial Neural Networks and Machine Learning - ICANN 2019: Workshop and Special Sessions, 2019, 11731: 396-402.
[2] Adhikari A, et al. 2019. arXiv:1904.08398.
[3] Beltagy I, et al. 2020. arXiv:2004.05150.
[4] Che Lei. 2019. Journal of Chinese Information Processing, 33: 93.
[5] Dai Z H, et al. 2019. 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019): 2978.
[6] Devlin J, et al. 2019. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Vol. 1: 4171.
[7] Kiesel J, et al. 2019. Proceedings of the 13th International Workshop on Semantic Evaluation. DOI: 10.18653/v1/S19-2145.
[8] Liu Y H, et al. 2019. arXiv:1907.11692. DOI: 10.48550/arXiv.1907.11692.
[9] Maas A, et al. 2011. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: 142.
[10] Pappagari R, et al. 2019. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019): 838. DOI: 10.1109/ASRU46091.2019.9003958.