Enhancing Entity Representations with Prompt Learning for Biomedical Entity Linking

Cited by: 0
Authors
Zhu, Tiantian [1,2]
Qin, Yang [1]
Chen, Qingcai [1,2]
Hu, Baotian [1]
Xiang, Yang [2]
Affiliations
[1] Harbin Inst Technol Shenzhen, Shenzhen, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022 | 2022
Funding
National Natural Science Foundation of China
Keywords
RECOGNITION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Biomedical entity linking aims to map mentions in biomedical text to standardized concepts or entities in a curated knowledge base (KB) such as the Unified Medical Language System (UMLS). Recent research tends to solve this problem in a unified framework based solely on surface-form matching between mentions and entities. Specifically, these methods focus on the variety challenge posed by the heterogeneous naming of biomedical concepts, while the ambiguity challenge, that the same word may refer to distinct entities in different contexts, is usually ignored. To address this challenge, we propose a two-stage linking algorithm that enhances entity representations with prompt learning. The first stage performs coarse-grained retrieval in a representation space defined by a bi-encoder that independently embeds the surface forms of the mention and the entity. Unlike previous one-model-fits-all systems, each candidate is then re-ranked by a finer-grained, prompt-tuned encoder that exploits contextual information. Extensive experiments show that our model achieves promising improvements over several state-of-the-art techniques on MedMentions, the largest public biomedical dataset, and on the NCBI disease corpus. Case studies also show that the proposed prompt-tuning strategy is effective at resolving both the variety and ambiguity challenges in the linking task.
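The abstract describes a two-stage retrieve-then-rerank pipeline: a bi-encoder over surface forms proposes candidates, and a context-aware, prompt-based encoder re-ranks them. The Python sketch below illustrates only that general idea; the encoder checkpoint, the cloze-style template, the toy knowledge base, and the frozen (non-tuned) scoring are illustrative assumptions, not the authors' released implementation, which fine-tunes the re-ranker with prompt learning.

```python
# Minimal sketch of a two-stage "retrieve then re-rank" biomedical entity linker.
# The checkpoint, prompt template, and toy KB are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

ENCODER = "cambridgeltl/SapBERT-from-PubMedBERT-fulltext"  # assumed biomedical encoder
tokenizer = AutoTokenizer.from_pretrained(ENCODER)
model = AutoModel.from_pretrained(ENCODER).eval()

@torch.no_grad()
def embed(texts):
    """Encode strings into unit-normalized [CLS] vectors."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=64, return_tensors="pt")
    cls = model(**batch).last_hidden_state[:, 0]
    return torch.nn.functional.normalize(cls, dim=-1)

def retrieve(mention, entity_names, k=8):
    """Stage 1: coarse-grained retrieval from surface forms only (bi-encoder)."""
    sims = (embed([mention]) @ embed(entity_names).T).squeeze(0)
    top = sims.topk(min(k, len(entity_names)))
    return [entity_names[int(i)] for i in top.indices]

def rerank(mention, context, candidates):
    """Stage 2: finer-grained re-ranking that injects the mention's context
    through a cloze-style prompt (the template here is a hypothetical example)."""
    query = f"{context} In this sentence, {mention} refers to the concept of"
    sims = (embed([query]) @ embed(candidates).T).squeeze(0)
    order = sims.argsort(descending=True)
    return [(candidates[int(i)], float(sims[int(i)])) for i in order]

if __name__ == "__main__":
    kb = ["Common cold", "Cold temperature", "Chronic obstructive lung disease"]
    ctx = "The patient reported a cold with cough and rhinorrhea for three days."
    cands = retrieve("cold", kb, k=3)
    print(rerank("cold", ctx, cands))   # context should favor "Common cold"
```

In this toy setup the first stage cannot tell the disease sense of "cold" from the temperature sense because it only sees the surface form; the context-aware second stage is what resolves that ambiguity, which is the point the abstract makes.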
Pages: 4036-4042
Number of pages: 7