Abstractive text summarization using deep learning with a new Turkish summarization benchmark dataset

Cited by: 4
Authors
Ertam, Fatih [1 ]
Aydin, Galip [2 ]
Affiliations
[1] Firat Univ, Technol Fac, Dept Digital Forens Engn, Elazig, Turkey
[2] Firat Univ, Engn Fac, Dept Comp Engn, Elazig, Turkey
Keywords
abstract summarization; deep learning; information retrieval; text summarization; web scraping; FRAMEWORK; MODELS;
DOI
10.1002/cpe.6482
CLC Classification Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
The exponential increase in the amount of textual data available on the Internet creates new challenges for accessing information accurately and quickly. Text summarization can be defined as reducing the length of a text without spoiling its meaning. Summarization can be performed extractively, abstractively, or with a combination of both. In this study, we focus on abstractive summarization, which can produce more human-like summaries. For the study, we created a Turkish news summarization benchmark dataset from various news agency web portals by crawling the news title, short news, news content, and keywords for the last 5 years. The dataset is made publicly available for researchers. The deep learning network was trained on the news headlines and short news texts from the prepared dataset, and the trained network was then expected to generate the news headline as the summary of the short news text. To evaluate performance, Rouge-1, Rouge-2, and Rouge-L scores were compared in terms of precision, recall (sensitivity), and F1 measure. Performance values are reported for each sentence as well as averaged over 50 randomly selected sentences. The F1 measure values are 0.4317, 0.2194, and 0.4334 for Rouge-1, Rouge-2, and Rouge-L, respectively. The results show that the approach is promising for Turkish text summarization studies and that the prepared dataset will add value to the literature.
Pages: 10
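
The abstract describes building the benchmark by crawling news titles, short news texts, full contents, and keywords from news agency web portals. The record does not specify which portals or page structures were used, so the following is only a minimal illustrative sketch of such a crawler in Python using requests and BeautifulSoup; the URL, CSS selectors, and output file name are hypothetical placeholders, not the authors' actual pipeline.

```python
# Sketch: collect (title, short_news, content, keywords) records from a news portal.
# The URL and CSS selectors below are HYPOTHETICAL placeholders.
import csv
import requests
from bs4 import BeautifulSoup

def scrape_article(url: str) -> dict:
    """Fetch one article page and extract the four dataset fields."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.select_one("h1.news-title").get_text(strip=True),          # hypothetical selector
        "short_news": soup.select_one("div.news-summary").get_text(strip=True),  # hypothetical selector
        "content": soup.select_one("div.news-body").get_text(" ", strip=True),   # hypothetical selector
        "keywords": ";".join(a.get_text(strip=True) for a in soup.select("a.news-tag")),  # hypothetical selector
    }

if __name__ == "__main__":
    urls = ["https://example-news-portal.com/article/1"]  # placeholder article URLs
    with open("turkish_news_dataset.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "short_news", "content", "keywords"])
        writer.writeheader()
        for u in urls:
            writer.writerow(scrape_article(u))
```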
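
The evaluation reports ROUGE-1, ROUGE-2, and ROUGE-L precision, recall, and F1 per sentence and averaged over 50 randomly selected sentences. As an illustration of how such scores can be computed (the record does not state which implementation the authors used), the sketch below uses the open-source rouge-score package (pip install rouge-score); the reference and generated headlines are invented examples, not data from the paper.

```python
# Sketch: score one generated headline against its reference headline with
# ROUGE-1, ROUGE-2, and ROUGE-L, reporting precision, recall, and F1.
from rouge_score import rouge_scorer

reference = "central bank keeps interest rates unchanged"   # gold headline (illustrative only)
generated = "central bank leaves interest rates unchanged"  # model output (illustrative only)

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
scores = scorer.score(reference, generated)

for name, s in scores.items():
    print(f"{name}: precision={s.precision:.4f} recall={s.recall:.4f} f1={s.fmeasure:.4f}")
```

Per-sentence scores of this kind can then be averaged over the evaluation set to obtain aggregate figures such as the F1 values of 0.4317, 0.2194, and 0.4334 quoted in the abstract.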