Translating medical terminologies through word alignment in parallel text corpora

Cited by: 25
Authors
Deleger, Louise [1 ,2 ,3 ]
Merkel, Magnus [4 ]
Zweigenbaum, Pierre [5 ]
Affiliations
[1] INSERM, Ctr Cordeliers, UMR S 872, Eq 20, F-75006 Paris, France
[2] Univ Paris 06, F-75006 Paris, France
[3] Univ Paris 05, F-75006 Paris, France
[4] Linkoping Univ, Dept Comp & Informat Sci, S-58183 Linkoping, Sweden
[5] CNRS, LIMSI, UPR3251, F-91403 Orsay, France
Funding
EU Horizon 2020;
Keywords
Natural Language Processing; Medical terminology; Multilinguality; Parallel corpora; Word alignment;
DOI
10.1016/j.jbi.2009.03.002
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Developing international multilingual terminologies is a time-consuming process. We present a methodology that aims to ease this process by automatically acquiring new translations of medical terms through word alignment in parallel text corpora, and we test it on English and French. After collecting a parallel English-French corpus, we detected French translations of English terms from three terminologies: MeSH, SNOMED CT and the MedlinePlus Health Topics. We obtained 74.8%, 77.8% and 76.3% linguistically correct new translations for the three terminologies, respectively. A sample of the MeSH translations was submitted to expert review, and 61.5% were deemed desirable additions to the French MeSH. In conclusion, we successfully obtained good-quality new translations, which underlines the suitability of alignment in text corpora as an aid to translating terminologies. Our method may be applied to different European languages and provides a methodological framework that may be used with different processing tools. © 2009 Elsevier Inc. All rights reserved.
Pages: 692-701
Page count: 10
Related papers
30 in total
  • [11] CLAVEAU V, 2005, LNCS, V3581
  • [12] Automatic processing of multilingual medical terminology: applications to thesaurus enrichment and cross-language information retrieval
    Déjean, H
    Gaussier, E
    Renders, JM
    Sadat, F
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2005, 33 (02) : 111 - 124
  • [13] DELEGER L, 2006, P AMIA ANN FALL S WA, P185
  • [14] Deleger L, 2006, STUD HEALTH TECHNOL, V124, P747
  • [15] FOO J, 2009, SERIES TERMINO UNPUB
  • [16] Gale W. A., 1993, Computational Linguistics, V19, P75
  • [17] Gaussier E., 1998, P 36 ANN M ASS COMPU, V1, P444
  • [18] Kay M., 1988, Text-Translation alignment
  • [19] Medical dictionaries for patient encoding systems: a methodology
    Lovis, C
    Baud, R
    Rassinoux, AM
    Michel, PA
    Scherrer, JR
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 1998, 14 (1-2) : 201 - 214
  • [20] Manning Chris, 1999, Foundations of statistical natural language processing