Disambiguation of Medical Abbreviations in French with Supervised Methods

Times cited: 3
Authors
Koptient, Anais [1]
Grabar, Natalia [1 ]
Affiliations
[1] Univ Lille, CNRS, UMR 8163 STL, F-59000 Lille, France
Source
PUBLIC HEALTH AND INFORMATICS, PROCEEDINGS OF MIE 2021 | 2021 / Vol. 281
Keywords
Word sense disambiguation; Medical domain; Abbreviations; France;
DOI
10.3233/SHTI210171
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Abbreviations are very frequent in medical and health documents, but their semantics are opaque. Associating them with their expanded forms, such as Chronic obstructive pulmonary disease for COPD, may help readers understand them. Yet several abbreviations are ambiguous and have more than one possible expanded form. We propose to disambiguate abbreviations in order to associate each one with the proper expansion for a given context. We treat the problem as supervised categorization. We create reference data and test several algorithms. The descriptors are collected from the lexical and syntactic contexts of abbreviations. Training is done on sentences containing the expanded forms of abbreviations. Testing is done on a manually built corpus in which the meaning of each abbreviation is defined according to its context. Our approach reaches up to 0.895 F-measure on training data and 0.773 on test data.
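The training setup described in the abstract (learn from sentences that contain an expanded form, then classify the context of the bare abbreviation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record does not say which algorithms or descriptors were used, so a small multinomial Naive Bayes over a window of context words stands in for them, the ABBR placeholder and the helper names are hypothetical, and the French sentences are invented toy data, not the authors' corpus.

```python
import math
from collections import Counter, defaultdict

def context_words(tokens, index, window=3):
    """Lowercased tokens within `window` positions of tokens[index], excluding it."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return [t.lower() for t in left + right]

def featurize(sentence, target, window=3):
    """Lexical context of the first occurrence of `target` in a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return context_words(tokens, tokens.index(target), window)

class ContextNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a stand-in classifier, not the paper's)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, examples):
        for words, label in examples:
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, words):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training sentences: the expanded form has been replaced by the
# hypothetical placeholder ABBR, and the expansion itself serves as the label.
PREGNANCY = "interruption volontaire de grossesse"
CARDIAC = "insuffisance ventriculaire gauche"
TRAIN = [
    ("la patiente demande une ABBR avant douze semaines de grossesse", PREGNANCY),
    ("une ABBR est possible avant la fin du premier trimestre de grossesse", PREGNANCY),
    ("le patient présente une ABBR avec dyspnée et œdème pulmonaire", CARDIAC),
    ("ABBR sévère avec fraction basse chez le patient", CARDIAC),
]

model = ContextNaiveBayes().fit(
    [(featurize(sentence, "ABBR"), label) for sentence, label in TRAIN]
)

# At test time the sentence contains the bare abbreviation instead.
prediction = model.predict(featurize("la patiente souhaite une IVG avant douze semaines", "IVG"))
```

Syntactic descriptors, which the abstract also mentions, are left out here; they could be appended to the word list returned by `featurize` without changing the classifier.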
Pages: 313-317
Number of pages: 5