Text mining, names and security

Cited by: 5

Authors:

Thompson, P [1]

Affiliation:

[1] Dartmouth Coll, Hanover, NH 03755 USA

Source:

JOURNAL OF DATABASE MANAGEMENT | 2005 / Vol. 16 / No. 1

Keywords:

co-reference; multiple hypothesis tracking; name matching; natural language processors

DOI:

10.4018/jdm.2005010104

Chinese Library Classification number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

A Process Query System, a new approach to representing and querying multiple hypotheses, is proposed for cross-document co-reference and linking based on existing entity extraction, co-reference and database name-matching technologies. A crucial component of linking entities across documents is the ability to recognize when different name strings are potential references to the same entity. Given the extraordinary range of variation international names can take when rendered in the Roman alphabet, this is a daunting task. The extension of name variant matching to free text will add important text mining functionality for intelligence and security informatics' toolkits.
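
As a rough illustration of the name-variant problem the abstract describes, the sketch below scores how similar two Romanized name strings are. It is not the Process Query System or any method from the paper: it is a generic string-similarity baseline built on Python's standard `difflib` and `unicodedata` modules, and the function names (`normalize`, `name_similarity`, `is_potential_match`) and the 0.8 threshold are assumptions chosen only for this example.

```python
import difflib
import unicodedata


def normalize(name: str) -> str:
    """Lowercase a name, strip diacritics, and collapse punctuation and spaces."""
    # Decompose accented characters so the combining marks can be dropped.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit, or space with a space.
    kept = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in stripped.lower()
    )
    return " ".join(kept.split())


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_potential_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag two name strings as potential references to the same entity.

    The threshold is a hypothetical value for illustration, not a tuned one.
    """
    return name_similarity(a, b) >= threshold


if __name__ == "__main__":
    pairs = [
        ("Mohammed", "Muhammad"),
        ("José García", "Jose Garcia"),
        ("Thompson", "Bikel"),
    ]
    for left, right in pairs:
        score = name_similarity(left, right)
        print(f"{left!r} vs {right!r}: {score:.2f}")
```

A character-level ratio like this only captures surface spelling variation. It knows nothing about transliteration rules, name order, or context, which is part of why the abstract calls the task daunting.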


Pages: 54-59

Page count: 6
