BioRel: towards large-scale biomedical relation extraction

Cited by: 19
Authors
Xing, Rui [1]
Luo, Jie [1]
Song, Tengwei [1]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, 37 Xueyuan Rd, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distant supervision; Relation extraction; Information extraction; Medline
DOI
10.1186/s12859-020-03889-5
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Although biomedical publications are growing rapidly, structured knowledge that computer programs can easily process is still lacking. Extracting such knowledge from plain text and transforming it into structured form makes relation extraction an important problem. Datasets play a critical role in the development of relation extraction methods. However, existing relation extraction datasets in the biomedical domain are mainly human-annotated, and their scale is usually limited because annotation is labor-intensive and time-consuming.
Results: We construct BioRel, a large-scale dataset for biomedical relation extraction, using the Unified Medical Language System as the knowledge base and Medline as the corpus. We first identify mentions of entities in Medline sentences and link them to the Unified Medical Language System with MetaMap. Then, we assign each sentence a relation label by distant supervision. Finally, we adapt state-of-the-art deep learning and statistical machine learning methods as baseline models and conduct comprehensive experiments on the BioRel dataset.
Conclusions: Based on extensive experimental results, we show that BioRel is a suitable large-scale dataset for biomedical relation extraction, providing both reasonable baseline performance and many remaining challenges for deep learning and statistical methods.
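The labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the knowledge base, the concept IDs, the relation name, and the helper functions `link_entities` and `distant_label` are all hypothetical placeholders, and a simple dictionary lookup stands in for MetaMap. It only shows the general distant supervision idea, assuming that a sentence mentioning two entities that are related in the knowledge base is labeled with that relation, and with "NA" otherwise.

```python
from itertools import permutations

# Toy knowledge base: (head concept ID, tail concept ID) -> relation.
# All IDs and relation names are made-up placeholders, not real UMLS entries.
TOY_KB = {
    ("C_ASPIRIN", "C_HEADACHE"): "may_treat",
}

# Toy mention dictionary standing in for an entity linker such as MetaMap.
TOY_LEXICON = {
    "aspirin": "C_ASPIRIN",
    "headache": "C_HEADACHE",
    "fever": "C_FEVER",
}


def link_entities(sentence):
    """Return concept IDs of lexicon terms found in the sentence, in lexicon order."""
    cleaned = sentence.lower().replace(".", " ").replace(",", " ")
    tokens = set(cleaned.split())
    return [cid for term, cid in TOY_LEXICON.items() if term in tokens]


def distant_label(sentence, kb=TOY_KB):
    """Label every ordered entity pair in the sentence with its KB relation, or "NA"."""
    concepts = link_entities(sentence)
    return [
        {"sentence": sentence, "head": h, "tail": t, "relation": kb.get((h, t), "NA")}
        for h, t in permutations(concepts, 2)
    ]


if __name__ == "__main__":
    for example in distant_label("Aspirin relieves headache."):
        print(example["head"], example["tail"], example["relation"])
```

Because the label comes from the knowledge base rather than from the sentence itself, a sentence that merely mentions both entities receives the same label as one that actually states the relation, which is the usual source of noise in distantly supervised data.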
Pages: 13