Automatic Keyword Extraction for Text Summarization in e-Newspapers

Cited by: 10
Authors
Thomas, Justine Raju [1]
Bharti, Santosh Kumar [1]
Babu, Korra Sathya [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Rourkela 769008, Odisha, India
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATICS AND ANALYTICS (ICIA '16) | 2016
Keywords
Automatic keyword detection; e-Newspaper; Natural language processing; Text summarization
DOI
10.1145/2980258.2980442
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Summarization is the process of reducing a text document to a summary that retains the most important points of the original. Extractive summarizers operate on the given text to extract the sentences that best convey its message. Most extractive summarization techniques revolve around finding keywords and extracting the sentences that contain more keywords than the rest. Keyword extraction is usually done by extracting relevant words that occur more frequently than others, with emphasis on the important ones. Manual extraction or annotation of keywords is a tedious, error-prone process that requires considerable effort and time. In this paper, we propose an algorithm to extract keywords automatically for text summarization in e-newspaper datasets. The proposed algorithm is evaluated by comparing its results on articles with similar titles from four different e-newspapers, to check the similarity and consistency of the summaries.
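The general idea the abstract describes (pick frequent words as keywords, then keep the sentences containing the most keywords) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the stopword list, the regex-based sentence splitter, and the names `extract_keywords` and `summarize` are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "are", "all", "on", "for"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extract_keywords(text, top_k=5):
    """Return the top_k most frequent non-stopword tokens."""
    counts = Counter(tokenize(text))
    return [word for word, _ in counts.most_common(top_k)]


def summarize(text, top_k=5, num_sentences=2):
    """Keep the num_sentences sentences with the most keyword hits, in original order."""
    keywords = set(extract_keywords(text, top_k))
    sentences = split_sentences(text)
    # Score each sentence by the number of keyword occurrences it contains.
    scored = [(sum(1 for w in tokenize(s) if w in keywords), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties are broken by earlier position.
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:num_sentences]
    return [sentences[i] for _, i in sorted(best, key=lambda p: p[1])]
```

Raw frequency is the simplest possible keyword signal; weighting schemes or part-of-speech filters could replace `extract_keywords` without changing the sentence-selection step.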
Pages: 8