Constructing a taxonomy to support multi-document summarization of dissertation abstracts

Cited by: 2
Authors
Ou S.-Y. [1]
Khoo C.S.G. [1]
Goh D.H. [1]
Affiliation
[1] Division of Information Studies, School of Communication and Information, Nanyang Technological University, 639798, Singapore
Source
Journal of Zhejiang University-SCIENCE A | 2005 / Vol. 6 / No. 11
Keywords
Automatic multi-document summarization; Digital library; Text summarization; Variable-based framework
DOI
10.1631/jzus.2005.A1258
Abstract
This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level and micro-level discourse structure to identify important information that can be extracted from dissertation abstracts, and then uses a variable-based framework to integrate and organize the extracted information across dissertation abstracts. The framework has a hierarchical structure and focuses on the research concepts, and the research relationships among them, found in sociology dissertation abstracts. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. This paper describes the variable-based framework and the summarization process, and then reports the construction of the taxonomy that supports the summarization process. An example shows how the constructed taxonomy is used to identify important concepts and to integrate the concepts extracted from different abstracts.
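The two roles of the taxonomy named in the abstract (identifying concepts in text, and linking similar concepts across abstracts) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy taxonomy, the sample abstracts, the substring matching, and the function names `identify_concepts` and `link_concepts` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: narrower term -> broader concept.
# These entries are illustrative only and do not come from the paper.
TAXONOMY = {
    "academic achievement": "educational outcome",
    "school performance": "educational outcome",
    "family income": "socioeconomic status",
    "parental education": "socioeconomic status",
}


def identify_concepts(text):
    """Return the taxonomy terms that occur in the text (plain substring match)."""
    lowered = text.lower()
    return sorted(term for term in TAXONOMY if term in lowered)


def link_concepts(abstracts):
    """Group terms found in different abstracts under their shared broader concept."""
    groups = defaultdict(set)
    for doc_id, text in abstracts.items():
        for term in identify_concepts(text):
            groups[TAXONOMY[term]].add((doc_id, term))
    return {parent: sorted(members) for parent, members in groups.items()}


# Made-up sample abstracts, used only to exercise the sketch.
abstracts = {
    "A1": "This study examines the effect of family income on academic achievement.",
    "A2": "Parental education was related to school performance among adolescents.",
}
linked = link_concepts(abstracts)
```

In this sketch, differently worded concepts from separate abstracts end up under one broader taxonomy node, which is the kind of linking structure the abstract describes; a real system would need far more robust term matching than substring lookup.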
Pages: 1258-1267
Page count: 9