Graph-Based Text Summarization Using Modified TextRank

Cited by: 61
Authors
Mallick, Chirantana [1]
Das, Ajit Kumar [1]
Dutta, Madhurima [1]
Das, Asit Kumar [1]
Sarkar, Apurba [1]
Affiliation
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur, Howrah, India
Source
SOFT COMPUTING IN DATA ANALYTICS, SCDA 2018 | 2019 / Vol. 758
Keywords
Extractive summarization; Single-document source; Sentence index; Similarity graph; PageRank; ROUGE;
DOI
10.1007/978-981-13-0514-6_14
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid growth of the Internet has made efficient access to enormous amounts of information more difficult, and managing this information requires efficient and effective methods and tools. This paper describes a graph-based text summarization method that captures the aboutness of a text document. The method uses a modified TextRank, which is based on the PageRank score defined for each page on the Web. It constructs a graph with sentences as the nodes and the similarity between two sentences as the weight of the edge between them. A modified inverse sentence frequency-cosine similarity gives different weightage to different words in a sentence, whereas traditional cosine similarity treats all words equally. The graph is made sparse and partitioned into clusters, on the assumption that sentences within a cluster are similar to each other and sentences in different clusters are dissimilar. The performance evaluation of the proposed summarization technique shows the effectiveness of the method.
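The pipeline outlined in the abstract (sentence graph, ISF-weighted cosine edges, sparsification, PageRank-style scoring) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the modified ISF-cosine similarity, so the smoothed weight `log(1 + N / sf)`, the sparsification `threshold`, the `damping` value, and all function names here are hypothetical. The clustering step is omitted entirely.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def isf_weights(token_lists):
    """Inverse sentence frequency per word (assumed form: log(1 + N / sf))."""
    n = len(token_lists)
    sf = Counter(tok for toks in token_lists for tok in set(toks))
    return {tok: math.log(1.0 + n / count) for tok, count in sf.items()}


def isf_cosine(a, b, isf):
    """Cosine similarity between two token lists with ISF-weighted counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] * isf[t] ** 2 for t in ca.keys() & cb.keys())
    na = math.sqrt(sum((ca[t] * isf[t]) ** 2 for t in ca))
    nb = math.sqrt(sum((cb[t] * isf[t]) ** 2 for t in cb))
    return dot / (na * nb) if na and nb else 0.0


def rank_sentences(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a sparse similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    toks = [tokenize(s) for s in sentences]
    isf = isf_weights(toks)
    # Build the symmetric weight matrix; edges below the threshold are
    # dropped, which is one simple (assumed) way to make the graph sparse.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = isf_cosine(toks[i], toks[j], isf)
            if sim >= threshold:
                w[i][j] = w[j][i] = sim
    out = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n
            + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2, **kwargs):
    """Return the k best-scoring sentences in their original order."""
    scores = rank_sentences(sentences, **kwargs)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "The cat sat on the mat.",
        "A cat sat on a mat near the door.",
        "Stock prices fell sharply today.",
    ]
    print(summarize(doc, k=2))
```

Restoring the selected sentences to document order reflects the "Sentence index" keyword, though how the paper actually uses the sentence index is an assumption here. A sentence that shares no words with the others keeps only the baseline score `(1 - damping) / n`, so it is ranked last.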
Pages: 137-146
Page count: 10