Overview of ChEMU 2022 Evaluation Campaign: Information Extraction in Chemical Patents

Cited by: 0
Authors
Li, Yuan [1]
Fang, Biaoyan [1]
He, Jiayuan [1, 4]
Yoshikawa, Hiyori [1, 5]
Akhondi, Saber A. [2]
Druckenbrodt, Christian [3]
Thorne, Camilo [3]
Afzal, Zubair [2]
Zhai, Zenan [1]
Baldwin, Timothy [1]
Verspoor, Karin [1, 4]
Affiliations
[1] Univ Melbourne, Melbourne, Vic, Australia
[2] Elsevier BV, Amsterdam, Netherlands
[3] Elsevier Informat Syst GmbH, Frankfurt, Germany
[4] RMIT Univ, Melbourne, Vic, Australia
[5] Fujitsu Ltd, Tokyo, Japan
Source
EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION (CLEF 2022) | 2022 / Vol. 13390
Funding
Australian Research Council
Keywords
Chemical patents; Text mining; Information Extraction; TEXT; DRUGS
DOI
10.1007/978-3-031-13643-6_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we provide an overview of the Cheminformatics Elsevier Melbourne University (ChEMU) evaluation lab 2022, part of the Conference and Labs of the Evaluation Forum 2022 (CLEF 2022). The ChEMU campaign focuses on information extraction tasks over chemical reactions in patents. The ChEMU 2020 lab provided two information extraction tasks: named entity recognition and event extraction. The ChEMU 2021 lab introduced one more task, anaphora resolution. This year, we re-run all three tasks with new test data. Together, the tasks support comprehensive automatic chemical patent analysis. Herein, we describe the resources created for these tasks and the evaluation methodology adopted. We also provide a brief summary of the methods employed by participants of this lab and the results obtained across 22 runs from 3 teams, finding that several submissions achieve better results than the baseline methods prepared by the organizers.
Pages: 521-540
Page count: 20