A PROBABILISTIC APPROACH TO MULTI-DOCUMENT SUMMARIZATION FOR GENERATING A TILED SUMMARY

Cited by: 3
Authors
Saravanan, M. [1]
Raman, S. [1]
Ravindran, B. [1]
Affiliation
[1] IIT Madras, Dept Comp Sci & Engn, Chennai 600024, Tamil Nadu, India
Keywords
Text summarization; probabilistic model; tiled summary
DOI
10.1142/S1469026806001976
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Given the widespread use of the Internet, the availability of data is no longer a major issue; the availability of information and knowledge is. Because of data overload and the time-critical nature of information needs, automatic summarization of documents plays a significant role in information retrieval and text data mining. This paper discusses the design of a multi-document summarizer that uses Katz's K-mixture model for term distribution. The model helps rank sentences through a modified term weight assignment. Highly ranked sentences are selected for the final summary, repetitive sentences are eliminated, and a tiled summary is produced. Our method avoids redundancy and produces a readable (even browsable) summary, which we refer to as an event-specific tiled summary. The system has been evaluated against the frequently occurring sentences in summaries generated by a set of human subjects. Our system outperforms other auto-summarizers at different extraction levels with respect to the ideal summary, and is close to the ideal summary at the 40% extraction level.
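As a rough illustration of the term-distribution model named in the abstract, the sketch below implements the textbook form of Katz's K-mixture, in which the probability that a term occurs k times in a document is determined by its collection frequency and document frequency. The abstract does not give the paper's modified term weight, so `score_sentence` and its negative-log weighting are assumptions made for illustration only, and all function names are hypothetical.

```python
import math
from collections import Counter

def estimate_params(docs):
    """Return {term: (lam, beta)} from a list of tokenized documents.

    lam  = cf / N          (average occurrences per document)
    beta = (cf - df) / df  (extra occurrences per document containing the term)
    """
    n_docs = len(docs)
    cf, df = Counter(), Counter()
    for tokens in docs:
        counts = Counter(tokens)
        cf.update(counts)          # total occurrences across the collection
        df.update(counts.keys())   # number of documents containing the term
    return {t: (cf[t] / n_docs, (cf[t] - df[t]) / df[t]) for t in cf}

def k_mixture_prob(k, lam, beta):
    """P(term occurs exactly k times in a document) under the K-mixture.

    Written with alpha = lam / beta substituted in, so that beta == 0
    (the term never repeats within a document) needs no special case.
    """
    if k == 0:
        return 1.0 - lam / (beta + 1.0)
    return lam * beta ** (k - 1) / (beta + 1.0) ** (k + 1)

def score_sentence(sentence_tokens, doc_tokens, params):
    """Illustrative sentence score: an assumption, NOT the paper's weighting.

    Sums -log2 P(k) over the distinct terms of the sentence, where k is the
    term's count in the source document; terms absent from it are skipped.
    """
    doc_counts = Counter(doc_tokens)
    score = 0.0
    for term in set(sentence_tokens):
        k = doc_counts[term]
        if k == 0 or term not in params:
            continue
        lam, beta = params[term]
        score += -math.log2(k_mixture_prob(k, lam, beta))
    return score
```

Under this weighting, a sentence scores higher when its terms occur in the source document with counts that the model considers unlikely; sentences could then be sorted by score and the top fraction kept, for example 40% to match the extraction level mentioned above.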
Pages: 231-243
Page count: 13