State-of-the-art approach to extractive text summarization: a comprehensive review

Cited by: 26
Authors
Yadav, Avaneesh Kumar [1]
Ranvijay [1]
Yadav, Rama Shankar [1]
Maurya, Ashish Kumar [1]
Affiliations
[1] Motilal Nehru Natl Inst Technol Allahabad, Dept Comp Sci & Engn, Prayagraj, India
Keywords
Text summarization; Extractive text summarization; Research issues; Graph-based approaches; Machine learning techniques; Clustering-based approaches; SINGLE-DOCUMENT; EVOLUTIONARY; FRAMEWORK; SYSTEM;
DOI
10.1007/s11042-023-14613-9
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the rapid growth of social media platforms, the digitization of official records, and the digital publication of articles, books, magazines, and newspapers, large volumes of data are generated every day. This data is a major source of information and contains a vast amount of text that may be complex, ambiguous, redundant, irrelevant, and unstructured. We therefore need tools and methods that help us understand and automatically summarize this text. There are two main approaches to text summarization: abstractive and extractive. Abstractive text summarization generates a concise summary by capturing the salient content of the input documents and paraphrasing it in new sentences and phrases. Extractive text summarization, in contrast, produces a summary by selecting and combining the most significant sentences and phrases from the source documents. Researchers have proposed numerous techniques for both kinds of summarization. In this work, we classify extractive text summarization approaches and review them based on their characteristics, techniques, and performance. We discuss the existing approaches along with their limitations. We also classify and discuss evaluation measures and outline the research challenges in extractive text summarization.
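As a rough illustration of the extractive idea described in the abstract (select the most significant sentences and combine them), the following minimal Python sketch scores each sentence by the average corpus frequency of its words and keeps the top k in their original order. It is not a method taken from the review: the function names `split_sentences` and `summarize`, the regex-based sentence splitter, and the frequency score are all assumptions chosen for brevity, and no stop-word filtering is applied.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in source order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole input serve as a crude significance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, highest first (stable for ties).
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order so the summary reads naturally.
    chosen = sorted(ranked[:k])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    print(summarize("Cats sleep. Cats eat fish. Dogs bark.", k=2))
```

Because the output is built only from sentences copied verbatim out of the input, this is extractive rather than abstractive; the graph-based, clustering-based, and machine learning approaches named in the keywords would replace the scoring step with something considerably more elaborate.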
Pages: 29135-29197
Page count: 63