Automatic generic document summarization based on non-negative matrix factorization

Cited by: 93
Authors
Lee, Ju-Hong [1]
Park, Sun [2]
Ahn, Chan-Min [1]
Kim, Daeho [3]
Affiliations
[1] Inha Univ, Dept Comp Sci & Informat Engn, Inchon, South Korea
[2] Honam Univ, Dept Comp Engn, Kwangju, South Korea
[3] Inha Univ, Dept Commun & Informat, Inchon, South Korea
Keywords
Generic summarization; NMF; LSA; Semantic feature; Semantic variable
DOI
10.1016/j.ipm.2008.06.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In existing unsupervised methods, Latent Semantic Analysis (LSA) is used for sentence selection. However, the obtained results are less meaningful, because singular vectors are used as the bases for sentence selection from given documents, and singular vector components can have negative values. We propose a new unsupervised method using Non-negative Matrix Factorization (NMF) to select sentences for automatic generic document summarization. The proposed method uses non-negative constraints, which are more similar to the human cognition process. As a result, the method selects more meaningful sentences for generic document summarization than those selected using LSA. (C) 2008 Elsevier Ltd. All rights reserved.
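The abstract describes the idea only at a high level, so the following is a minimal sketch of NMF-based sentence selection rather than the authors' exact procedure. It assumes a raw term-frequency term-by-sentence matrix, the standard multiplicative update rules for NMF, and a sentence score that weights each semantic feature by its share of the total mass in H. The names `nmf` and `rank_sentences`, the whitespace tokenization, and the parameters `k` and `top` are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix A (terms x sentences) as W @ H.

    Uses multiplicative updates, which keep W and H non-negative
    as long as A and the initial factors are non-negative.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def rank_sentences(sentences, k=2, top=2):
    """Return indices of the `top` highest-scoring sentences, in document order."""
    # Assumption: lowercase whitespace tokens and raw term counts.
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0

    W, H = nmf(A, k)

    # Assumption: weight each semantic feature (row of H) by its share
    # of the total mass, then score a sentence by its weighted column sum.
    weights = H.sum(axis=1) / H.sum()
    scores = weights @ H
    order = np.argsort(-scores)[:top]
    return sorted(order.tolist()), W, H

if __name__ == "__main__":
    doc = [
        "matrix factorization finds latent features in documents",
        "non-negative factors are easier to interpret",
        "the weather was pleasant yesterday",
        "latent features help select sentences for a summary",
    ]
    picked, _, _ = rank_sentences(doc, k=2, top=2)
    for i in picked:
        print(doc[i])
```

Because every entry of W and H stays non-negative, each sentence is expressed as a purely additive combination of semantic features, which is the property the abstract contrasts with the signed singular vectors of LSA.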
Pages: 20-34
Page count: 15