BioRel: towards large-scale biomedical relation extraction

Cited by: 19
Authors
Xing, Rui [1]
Luo, Jie [1]
Song, Tengwei [1]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, 37 Xueyuan Rd, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distant supervision; Relation extraction; Information extraction; Medline
DOI
10.1186/s12859-020-03889-5
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Although biomedical publications are growing rapidly, there is still a lack of structured knowledge that computer programs can easily process. Extracting such knowledge from plain text and transforming it into structured form makes relation extraction an important problem. Datasets play a critical role in the development of relation extraction methods. However, existing relation extraction datasets in the biomedical domain are mainly human-annotated, and their scale is usually limited because annotation is labor-intensive and time-consuming.
Results: We construct BioRel, a large-scale dataset for biomedical relation extraction, using the Unified Medical Language System (UMLS) as the knowledge base and Medline as the corpus. We first identify entity mentions in Medline sentences and link them to UMLS with MetaMap. Then, we assign each sentence a relation label using distant supervision. Finally, we adapt state-of-the-art deep learning and statistical machine learning methods as baseline models and conduct comprehensive experiments on the BioRel dataset.
Conclusions: Based on the extensive experimental results, we show that BioRel is a suitable large-scale dataset for biomedical relation extraction, providing both reasonable baseline performance and many remaining challenges for both deep learning and statistical methods.
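The distant supervision labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes entity mentions have already been linked to knowledge-base identifiers (the abstract uses MetaMap for that step), and the identifiers, relation name, example sentences, the `NA` fallback label, and the function `label_sentences` are all hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Tuple

# Fallback label for entity pairs with no relation in the knowledge base.
# The name "NA" is an assumption made for this sketch.
NA = "NA"

# Hypothetical toy knowledge base: (head id, tail id) -> relation name.
# These identifiers and the relation name are made up, not real UMLS data.
KB: Dict[Tuple[str, str], str] = {
    ("C0000001", "C0000002"): "may_treat",
}

# A sentence with its two already-linked entities: (text, head id, tail id).
Sentence = Tuple[str, str, str]


def label_sentences(
    sentences: List[Sentence], kb: Dict[Tuple[str, str], str]
) -> List[Dict[str, str]]:
    """Assign each sentence the relation the KB holds for its entity pair."""
    labeled = []
    for text, head, tail in sentences:
        # Distant supervision: the KB fact becomes the sentence label,
        # without checking that the sentence actually expresses it.
        relation = kb.get((head, tail), NA)
        labeled.append(
            {"text": text, "head": head, "tail": tail, "relation": relation}
        )
    return labeled


# Hypothetical example sentences.
SENTENCES: List[Sentence] = [
    ("Drug A was given to patients with disease B.", "C0000001", "C0000002"),
    ("Drug A and protein C were both measured.", "C0000001", "C0000003"),
]

dataset = label_sentences(SENTENCES, KB)
for row in dataset:
    print(row["relation"], "|", row["text"])
```

Because the label comes from the knowledge base rather than from the sentence itself, labels produced this way are generally noisy, which is one reason distantly supervised datasets remain challenging for baseline models.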
Pages: 13