Ambiguity in Hispanic names

Cited by: 5
Authors
Barcelo, Grettel [1]
Cendejas, Eduardo [1]
Bolshakov, Igor [1]
Sidorov, Grigori [1]
Affiliation
[1] Inst Politecn Nacl, Ctr Invest Computac, Mexico City 07738, DF, Mexico
Source
REVISTA SIGNOS | 2009 / Vol. 42 / No. 70
Keywords
Ambiguity; denominative sequence; generative grammar; association; composition
DOI
10.4067/S0718-09342009000200001
Chinese Library Classification number
H0 [Linguistics]
Discipline classification code
030303; 0501; 050102
Abstract
The constitution of Hispanic names assumes a degree of ambiguity in many cases. The structure of the denominative sequences in Hispanic countries presents five fundamental problems that obstruct their interpretation: (1) the double sex deduction in personal names, as in Guadalupe; (2) the association of names and/or surnames in one name, as in Jorge Luis, whose components exist separately; (3) the composition of the elements by means of a connector; (4) the name/surname duality; and (5) the accepted omission of some of the elements of the denominative sequences. This study focuses on the automatic detection and analysis of these types of ambiguities (uncertainties). A formal grammar that determines valid interpretations of the nominal chains was developed by means of the automatic labeling of all the elements of which this grammar is composed. Furthermore, graphs of the distribution of the names and surnames are presented, the most important of which reveals that the frequency abides by Zipf's law. A corpus of 745,084 personal records was used as a data source. From these records, 93,998 type names and 13,779 type surnames, including simple, compound, and associate ones, were taken. From these, 77,162 (82%) ambiguity sources in names and 2,739 (20%) ambiguity sources in surnames were detected. From all of the personal records analyzed, 241,992 (33%) present at least two valid interpretations in the denomination.
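To make the name/surname duality in the abstract concrete, the following is a minimal sketch of how valid interpretations of a denominative sequence could be enumerated. It is not the authors' grammar: the `LEXICON` entries, the `interpretations` function, and the rule "one or more given names followed by one or two surnames" are all assumptions made for illustration, and connectors such as "de" or "de la" are not handled.

```python
from itertools import product

# Hypothetical toy lexicon (an assumption, not data from the paper):
# each token maps to the roles it may play in a denominative sequence.
LEXICON = {
    "Guadalupe": {"NAME", "SURNAME"},
    "Santiago": {"NAME", "SURNAME"},
    "Cruz": {"NAME", "SURNAME"},
    "Jorge": {"NAME"},
    "Luis": {"NAME"},
    "Barcelo": {"SURNAME"},
}


def interpretations(sequence):
    """Return every labeling that matches the assumed pattern:
    one or more NAME tokens followed by one or two SURNAME tokens."""
    tokens = sequence.split()
    # Tokens missing from the lexicon are treated as fully ambiguous (assumption).
    options = [sorted(LEXICON.get(t, {"NAME", "SURNAME"})) for t in tokens]
    valid = []
    for labels in product(*options):
        # Count the leading run of NAME labels.
        n_names = 0
        while n_names < len(labels) and labels[n_names] == "NAME":
            n_names += 1
        rest = labels[n_names:]
        # The remainder must consist of exactly one or two SURNAME labels.
        if n_names >= 1 and 1 <= len(rest) <= 2 and all(l == "SURNAME" for l in rest):
            valid.append(labels)
    return valid


if __name__ == "__main__":
    for record in ("Jorge Luis Barcelo", "Guadalupe Santiago Cruz"):
        print(record, "->", interpretations(record))
```

Under these assumptions, "Jorge Luis Barcelo" has a single reading, while "Guadalupe Santiago Cruz" has two, which is the kind of record the abstract counts as presenting at least two valid interpretations.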
Pages: 153-169
Page count: 17