Identifying important concepts from medical documents

Cited by: 32
Authors
Li, Quanzhi [1]
Wu, Yi-Fang Brook [1]
Affiliations
[1] New Jersey Inst Technol, Dept Informat Syst, Newark, NJ 07102 USA
Funding
National Science Foundation (USA)
Keywords
noun phrase extraction; keyphrase extraction; medical documents; medical concepts; document keyphrase; text mining
DOI
10.1016/j.jbi.2006.02.001
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Automated medical concept recognition is important for medical informatics tasks such as medical document retrieval and text mining. In this paper, we present a software tool, the keyphrase identification program (KIP), for identifying topical concepts in medical documents. KIP combines two functions: noun phrase extraction and keyphrase identification. The former automatically extracts noun phrases from medical literature as keyphrase candidates. The latter assigns each extracted noun phrase a weight based on how important it is to the document and how specific it is to the medical domain. The experimental results show that the noun phrase extractor is effective in identifying noun phrases from medical documents and that the keyphrase extractor is effective in identifying important medical conceptual terms. Both performed better than the systems they were compared to. (c) 2006 Elsevier Inc. All rights reserved.
Pages: 668-679
Page count: 12