Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

Cited by: 1
Authors
Divya, S. [1]
Sripriya, N. [1]
Andrew, J. [2]
Mazzara, Manuel [3]
Affiliations
[1] SSN Coll Engn, Dept Informat Technol, Kalavakkam, Tamil Nadu, India
[2] Manipal Acad Higher Educ, Dept Comp Sci & Engn, Manipal Inst Technol, Manipal, Karnataka, India
[3] Innopolis Univ, Inst Software Dev & Engn, Innopolis, Russia
Keywords
Document summarization; BERT; CNN; Transformer models; Abstractive summarization
DOI
10.7717/peerj-cs.2424
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the exponential proliferation of digital documents, there arises a pressing need for automated document summarization (ADS). Summarization, a compression technique, condenses a source document into concise sentences that encapsulate its salient information for summary generation. A primary challenge lies in crafting a dependable summary, contingent upon both extracted features and human-established parameters. This article introduces an intelligent methodology that seamlessly integrates extractive and abstractive techniques to ensure heightened relevance between the input document and its summary. Initially, input sentences undergo transformation into representations utilizing BERT, subsequently transposed into a symmetric matrix based on their similarity. Semantically congruent sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually crafted summaries. Employing this hybrid summarization technique on the CNN/DailyMail dataset and DUC2004, we evaluate its efficacy using ROUGE metrics. Results demonstrate the superiority of our proposed technique over conventional summarization methods.
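The extractive stage described in the abstract (sentence representations, a symmetric similarity matrix, then selection of semantically close sentences) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not give these details: cosine similarity as the similarity measure, the sum of similarities to all other sentences as the selection score, the hypothetical function names `similarity_matrix` and `select_sentences`, and the toy three-dimensional vectors, which stand in for real BERT sentence embeddings.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build a symmetric sentence-by-sentence similarity matrix."""
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = cosine(embeddings[i], embeddings[j])
            sim[i][j] = s
            sim[j][i] = s  # mirror the value so the matrix is symmetric
    return sim


def select_sentences(sim: List[List[float]], k: int) -> List[int]:
    """Pick the k sentences most similar to the rest (assumed scoring rule).

    The score of a sentence is the sum of its similarities to every other
    sentence; indices are returned in original document order.
    """
    n = len(sim)
    scores = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy vectors standing in for BERT sentence embeddings (assumption).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]
sim = similarity_matrix(toy_embeddings)
chosen = select_sentences(sim, k=2)
print(chosen)
```

In a fuller pipeline, the selected sentences would be passed to a transformer-based generator to produce the abstractive summary; that stage is not sketched here because the abstract does not specify the model or its objective function in enough detail.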
Pages: 1 / 26
Page count: 26
Related papers
19 in total
  • [1] A Context-Aware BERT Retrieval Framework Utilizing Abstractive Summarization
    Pan, Min
    Li, Teng
    Yang, Chenghao
    Zhou, Shuting
    Feng, Shaoxiong
    Fang, Youbin
    Li, Xingyu
    2022 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WI-IAT, 2022, : 873 - 878
  • [2] Towards a New Hybrid Approach for Abstractive Summarization
    Jaafar, Younes
    Bouzoubaa, Karim
    ARABIC COMPUTATIONAL LINGUISTICS, 2018, 142 : 286 - 293
  • [3] Hie-Transformer: A Hierarchical Hybrid Transformer for Abstractive Article Summarization
    Zhang, Xuewen
    Meng, Kui
    Liu, Gongshen
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT III, 2019, 11955 : 248 - 258
  • [4] Performance Study on Extractive Text Summarization Using BERT Models
    Abdel-Salam, Shehab
    Rafea, Ahmed
    INFORMATION, 2022, 13 (02)
  • [5] Enhanced automatic abstractive document summarization using transformers and sentence grouping
    Toprak, Ahmet
    Turan, Metin
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (04)
  • [6] Genetic Semantic Graph Approach for Multi-document Abstractive Summarization
    Khan, Atif
    Salim, Naomie
    Kumar, Yogan Jaya
    2015 FIFTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION PROCESSING AND COMMUNICATIONS (ICDIPC), 2015, : 173 - 181
  • [7] Knowledge-Enhanced Transformer Graph Summarization (KETGS): Integrating Entity and Discourse Relations for Advanced Extractive Text Summarization
    Onan, Aytug
    Alhumyani, Hesham
    MATHEMATICS, 2024, 12 (23)
  • [8] A Two-Stage Transformer-Based Approach for Variable-Length Abstractive Summarization
    Su, Ming-Hsiang
    Wu, Chung-Hsien
    Cheng, Hao-Tse
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2061 - 2072
  • [9] A Hybrid Solution To Abstractive Multi-Document Summarization Using Supervised and Unsupervised Learning
    Bhagchandani, Gaurav
    Bodra, Deep
    Gangan, Abhishek
    Mulla, Nikahat
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 566 - 570
  • [10] Integrating Topic-Aware Heterogeneous Graph Neural Network With Transformer Model for Medical Scientific Document Abstractive Summarization
    Khaliq, Ayesha
    Khan, Atif
    Awan, Salman Afsar
    Jan, Salman
    Umair, Muhammad
    Zuhairi, Megat F.
    IEEE ACCESS, 2024, 12 : 113855 - 113866