Abstractive text summarization using deep learning with a new Turkish summarization benchmark dataset

Cited by: 4
Authors
Ertam, Fatih [1 ]
Aydin, Galip [2 ]
Affiliations
[1] Firat Univ, Technol Fac, Dept Digital Forens Engn, Elazig, Turkey
[2] Firat Univ, Engn Fac, Dept Comp Engn, Elazig, Turkey
Keywords
abstract summarization; deep learning; information retrieval; text summarization; web scraping; FRAMEWORK; MODELS;
DOI
10.1002/cpe.6482
CLC Classification Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
The exponential increase in the amount of textual data available on the Internet creates new challenges for accessing information accurately and quickly. Text summarization can be defined as reducing the length of a text without spoiling its meaning. Summarization can be performed extractively, abstractively, or with a combination of both. In this study, we focus on abstractive summarization, which can produce more human-like summaries. For the study, we created a Turkish news summarization benchmark dataset from various news agency web portals by crawling the news title, short news, news content, and keywords for the last 5 years. The dataset is made publicly available for researchers. The deep learning network was trained on the news headlines and short news texts from the prepared dataset, and the trained network was then expected to generate the news headline as the summary of the short news text. To evaluate performance, Rouge-1, Rouge-2, and Rouge-L scores were compared in terms of precision, recall (sensitivity), and F1 measure. Performance values are reported for each sentence as well as averaged over 50 randomly selected sentences. The F1 measure values are 0.4317, 0.2194, and 0.4334 for Rouge-1, Rouge-2, and Rouge-L, respectively. The results show that the approach is promising for Turkish text summarization studies and that the prepared dataset will add value to the literature.
Pages: 10
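
The abstract describes building the benchmark by crawling news titles, short news texts, full contents, and keywords from news agency web portals. The record does not specify which portals or page structures were used, so the following is only a minimal illustrative sketch of such a crawler in Python using requests and BeautifulSoup; the URL, CSS selectors, and output file name are hypothetical placeholders, not the authors' actual pipeline.

```python
# Sketch: collect (title, short_news, content, keywords) records from a news portal.
# The URL and CSS selectors below are HYPOTHETICAL placeholders.
import csv
import requests
from bs4 import BeautifulSoup

def scrape_article(url: str) -> dict:
    """Fetch one article page and extract the four dataset fields."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.select_one("h1.news-title").get_text(strip=True),          # hypothetical selector
        "short_news": soup.select_one("div.news-summary").get_text(strip=True),  # hypothetical selector
        "content": soup.select_one("div.news-body").get_text(" ", strip=True),   # hypothetical selector
        "keywords": ";".join(a.get_text(strip=True) for a in soup.select("a.news-tag")),  # hypothetical selector
    }

if __name__ == "__main__":
    urls = ["https://example-news-portal.com/article/1"]  # placeholder article URLs
    with open("turkish_news_dataset.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "short_news", "content", "keywords"])
        writer.writeheader()
        for u in urls:
            writer.writerow(scrape_article(u))
```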
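
The evaluation reports ROUGE-1, ROUGE-2, and ROUGE-L precision, recall, and F1 per sentence and averaged over 50 randomly selected sentences. As an illustration of how such scores can be computed (the record does not state which implementation the authors used), the sketch below uses the open-source rouge-score package (pip install rouge-score); the reference and generated headlines are invented examples, not data from the paper.

```python
# Sketch: score one generated headline against its reference headline with
# ROUGE-1, ROUGE-2, and ROUGE-L, reporting precision, recall, and F1.
from rouge_score import rouge_scorer

reference = "central bank keeps interest rates unchanged"   # gold headline (illustrative only)
generated = "central bank leaves interest rates unchanged"  # model output (illustrative only)

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
scores = scorer.score(reference, generated)

for name, s in scores.items():
    print(f"{name}: precision={s.precision:.4f} recall={s.recall:.4f} f1={s.fmeasure:.4f}")
```

Per-sentence scores of this kind can then be averaged over the evaluation set to obtain aggregate figures such as the F1 values of 0.4317, 0.2194, and 0.4334 quoted in the abstract.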