Summarization of scholarly articles using BERT and BiGRU: Deep learning-based extractive approach

Cited by: 41
Authors
Bano, Sheher [1 ]
Khalid, Shah [1 ]
Tairan, Nasser Mansoor [2 ]
Shah, Habib [2 ]
Khattak, Hasan Ali [1 ]
Affiliations
[1] Natl Univ Sci & Technol NUST, H12, Islamabad 44000, Pakistan
[2] King Khalid Univ, Coll Comp Sci, Dept, Abha, Saudi Arabia
Keywords
Text summarization; Attention mechanism; BERT; BiGRU;
DOI
10.1016/j.jksuci.2023.101739
CLC number
TP [Automation technology; computer technology]
Discipline classification code
0812
Abstract
Extractive text summarization involves selecting and combining key sentences directly from the original text, rather than generating new content. While various methods, both statistical and graph-based, have been explored for this purpose, accurately capturing the intended meaning remains a challenge. To address this, researchers are investigating techniques that harness deep learning models like BERT (Bidirectional Encoder Representations from Transformers). However, BERT's input-length constraints limit its ability to summarize lengthy documents. We therefore propose a novel approach that combines BERT, a transformer network pre-trained on extensive self-supervised datasets, with BiGRU (Bidirectional Gated Recurrent Units), a recurrent neural network that captures sequential dependencies within the text for extracting salient information. Our method uses BERT to generate sentence-level embeddings, which are then fed into the BiGRU network, allowing the model to capture the context of the complete document. In experiments on the arXiv and PubMed datasets, the proposed approach outperformed several state-of-the-art models, achieving ROUGE-F scores of (46.7, 19.4, 35.4) and (47.0, 21.3, 39.7) on the two datasets, respectively. The proposed fusion of BERT and BiGRU significantly enhances extractive text summarization and shows promise for summarizing lengthy documents across domains that require concise and informative summaries. (c) 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
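The pipeline described in the abstract (sentence-level BERT embeddings fed to a BiGRU that scores sentences for extraction) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name, hidden sizes, and the random tensors standing in for precomputed 768-dimensional BERT sentence embeddings are all hypothetical.

```python
import torch
import torch.nn as nn

class BiGRUSentenceScorer(nn.Module):
    """Scores each sentence of a document for salience; the top-scored
    sentences would form the extractive summary. Assumes sentence
    embeddings (e.g., BERT [CLS] vectors) are precomputed."""
    def __init__(self, emb_dim=768, hidden=256):
        super().__init__()
        # Bidirectional GRU reads the sentence sequence in both directions,
        # giving each sentence context from the whole document.
        self.bigru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, 1)  # forward + backward states

    def forward(self, sent_embs):                 # (batch, n_sents, emb_dim)
        ctx, _ = self.bigru(sent_embs)            # (batch, n_sents, 2*hidden)
        return self.classifier(ctx).squeeze(-1)   # (batch, n_sents) salience logits

# A "document" of 12 sentences; random tensors stand in for BERT embeddings.
doc = torch.randn(1, 12, 768)
scorer = BiGRUSentenceScorer()
scores = scorer(doc)
top3 = scores[0].topk(3).indices  # indices of the 3 most salient sentences
print(scores.shape, sorted(top3.tolist()))
```

In the paper's setting, the salience logits would be trained against extractive labels (sentences that best match the reference summary), sidestepping BERT's input-length limit because BERT only encodes one sentence at a time while the BiGRU models the document-level sequence.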
Pages: 11