Automatic recognition of multi-word terms: The C-value/NC-value method

Cited by: 140
Authors
Frantzi K. [1]
Ananiadou S. [1]
Mima H. [2]
Affiliations
[1] Centre for Computational Linguistics, UMIST, Manchester, M60 1QD
[2] Dept. of Information Science, University of Tokyo, Bunkyo-ku, Tokyo 113
Keywords
Automatic extraction; Automatic term recognition (ATR); Domain independence; Linguistic and statistical information; Terms
DOI
10.1007/s007999900023
Abstract
Technical terms (henceforth called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, gives: 1) a method for the extraction of term context words (words that tend to appear with terms); 2) the incorporation of information from term context words into the extraction of terms. © 2000 Springer-Verlag.
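As a rough illustration of the first part, the sketch below scores candidate terms with the C-value measure as it is commonly stated: log2 of the term length times its frequency, where a nested candidate first has the mean frequency of the longer candidates containing it subtracted. The formula is not given in this record, so the code is an assumption about the method, not the authors' implementation. The function name `c_values`, the helper `contains`, and the toy frequencies are hypothetical, and the linguistic filtering step that selects candidates is omitted.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_values(freq):
    """Score multi-word candidates; `freq` maps word tuples to corpus counts.

    Assumed formula (not stated in this record):
      not nested: log2(|a|) * f(a)
      nested:     log2(|a|) * (f(a) - mean of f(b) over longer candidates b containing a)
    """
    scores = {}
    for a, f_a in freq.items():
        parents = [b for b in freq if contains(b, a)]
        if parents:
            # Discount occurrences that are explained by longer candidates.
            adjusted = f_a - sum(freq[b] for b in parents) / len(parents)
        else:
            adjusted = f_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Hypothetical counts, for illustration only.
toy = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 8,
}
scores = c_values(toy)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 2))
```

With these counts, "contact lens" is nested in "soft contact lens", so only the 3 occurrences not accounted for by the longer candidate contribute to its score.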
Pages: 115-130
Page count: 15