MLEC-QA: A Chinese Multi-Choice Biomedical Question Answering Dataset

Cited by: 0
Authors
Li, Jing [1]
Zhong, Shangping [1]
Chen, Kaizhi [1]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
SYSTEM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Question Answering (QA) has been successfully applied in human-computer interaction scenarios such as chatbots and search engines. In the biomedical domain, however, QA systems remain immature because expert-annotated datasets are limited in category and scale. In this paper, we present MLEC-QA, the largest-scale Chinese multi-choice biomedical QA dataset, collected from the National Medical Licensing Examination in China. The dataset comprises five subsets totaling 136,236 biomedical multi-choice questions, with extra materials (images or tables) annotated by human experts, and is the first to cover the following biomedical sub-fields: Clinic, Stomatology, Public Health, Traditional Chinese Medicine, and Traditional Chinese Medicine Combined with Western Medicine. We implement eight representative control methods and open-domain QA methods as baselines. Experimental results demonstrate that even the current best model achieves accuracies of only 40% to 55% on the five subsets, and performs especially poorly on questions that require sophisticated reasoning. We hope the release of the MLEC-QA dataset can serve as a valuable resource for research and evaluation in open-domain QA, and can also advance biomedical QA systems.
Pages: 8862 - 8874
Page count: 13
Related papers
50 in total
  • [1] SKR-QA: Semantic ranking and knowledge revise for multi-choice question answering
    Ren, Mucheng
    Huang, Heyan
    Gao, Yang
    NEUROCOMPUTING, 2021, 459 : 142 - 151
  • [2] Winnowing Knowledge for Multi-choice Question Answering
    Li, Yeqiu
    Zou, Bowei
    Li, Zhifeng
    Aw, Ai Ti
    Hong, Yu
    Zhu, Qiaoming
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1157 - 1165
  • [3] MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering
    Pal, Ankit
    Umapathi, Logesh Kumar
    Sankarasubbu, Malaikannan
    CONFERENCE ON HEALTH, INFERENCE, AND LEARNING, VOL 174, 2022, 174 : 248 - 260
  • [4] Two layers LSTM with attention for multi-choice question answering in exams
    Li, Yongbin
    INTERNATIONAL CONFERENCE ON FUNCTIONAL MATERIALS AND CHEMICAL ENGINEERING (ICFMCE 2017), 2018, 323
  • [5] A Legal Multi-Choice Question Answering Model Based on BERT and Attention
    Chen, Guibin
    Luo, Xudong
    Zhu, Junlin
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT IV, KSEM 2023, 2023, 14120 : 250 - 266
  • [6] BERT-CNN based evidence retrieval and aggregation for Chinese legal multi-choice question answering
    Yanling Li
    Jiaye Wu
    Xudong Luo
    Neural Computing and Applications, 2024, 36 : 5909 - 5925
  • [7] BERT-CNN based evidence retrieval and aggregation for Chinese legal multi-choice question answering
    Li, Yanling
    Wu, Jiaye
    Luo, Xudong
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (11): : 5909 - 5925
  • [8] PubMedQA: A Dataset for Biomedical Research Question Answering
    Jin, Qiao
    Dhingra, Bhuwan
    Liu, Zhengping
    Cohen, William W.
    Lu, Xinghua
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2567 - 2577
  • [9] ECG-QA: A Comprehensive Question Answering Dataset Combined With Electrocardiogram
    Oh, Jungwoo
    Lee, Gyubok
    Bae, Seongsu
    Kwon, Joon-Myoung
    Choi, Edward
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] JEC-QA: A Legal-Domain Question Answering Dataset
    Zhong, Haoxi
    Xiao, Chaojun
    Tu, Cunchao
    Zhang, Tianyang
    Liu, Zhiyuan
    Sun, Maosong
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9701 - 9708