DICE: Data-Efficient Clinical Event Extraction with Generative Models

Cited by: 0
Authors
Ma, Mingyu Derek [1]
Taylor, Alexander K. [1]
Wang, Wei [1]
Peng, Nanyun [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1 | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Event extraction for the clinical domain is an under-explored research area. The lack of training data, along with the high volume of domain-specific terminology with vague entity boundaries, makes the task especially challenging. In this paper, we introduce DICE, a robust and data-efficient generative model for clinical event extraction. DICE frames event extraction as a conditional generation problem and introduces a contrastive learning objective to accurately decide the boundaries of biomedical mentions. DICE also trains an auxiliary mention identification task jointly with event extraction tasks to better identify entity mention boundaries, and further introduces special markers to incorporate identified entity mentions as trigger and argument candidates for their respective tasks. To benchmark clinical event extraction, we compose MACCROBAT-EE, the first clinical event extraction dataset with argument annotation, based on an existing clinical information extraction dataset, MACCROBAT (Caufield et al., 2019). Our experiments demonstrate state-of-the-art performance of DICE for clinical and news domain event extraction, especially under low-data settings.
Pages: 15898-15917
Page count: 20
Related papers
58 in total
[1] Ahn D., 2006, P WORKSH ANN REAS TI, P1
[2] Event extraction for systems biology by text mining the literature [J]. Ananiadou, Sophia; Pyysalo, Sampo; Tsujii, Jun'ichi; Kell, Douglas B. TRENDS IN BIOTECHNOLOGY, 2010, 28 (07): 381-390
[3] [Anonymous], 2022, P 2022 C N AM CHAPT
[4] [Anonymous], 2022, P 2022 C N AM CHAPT
[5] Bengio S, 2015, ADV NEUR IN, V28
[6] Bethard S., 2016, P 10 INT WORKSHOP SE, P1052
[7] Doddington G. R., 2004, P 4 INT C LANG RES E
[8] Du X, 2020, PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), P671
[9] Clinical concept extraction: A methodology review [J]. Fu, Sunyang; Chen, David; He, Huan; Liu, Sijia; Moon, Sungrim; Peterson, Kevin J.; Shen, Feichen; Wang, Liwei; Wang, Yanshan; Wen, Andrew; Zhao, Yiqing; Sohn, Sunghwan; Liu, Hongfang. JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 109
[10] Harry Caufield J., 2019, medRxiv, P19009118