An imConvNet-based deep learning model for Chinese medical named entity recognition

Cited by: 4
Authors
Zheng, Yuchen [1]
Han, Zhenggong [2]
Cai, Yimin [1]
Duan, Xubo [1]
Sun, Jiangling [3]
Yang, Wei [1]
Huang, Haisong [2]
Affiliations
[1] Guizhou Univ, Med Coll, Guiyang 550025, Guizhou, Peoples R China
[2] Guizhou Univ, Key Lab Adv Mfg Technol, Minist Educ, Guiyang 550025, Guizhou, Peoples R China
[3] Guiyang Hosp Stomatol, Guiyang 550002, Guizhou, Peoples R China
Keywords
Named entity recognition; Convolutional neural network; Chinese electronic medical records; BiLSTM-CRF; BERT; BIG DATA; HEALTH; CARE
DOI
10.1186/s12911-022-02049-4
Chinese Library Classification (CLC) number
R-058
Abstract
Background: With advances in medical technology, information management in the medical field has matured. Medical big data analysis relies on the large volume of medical and health data stored in electronic medical systems, such as electronic medical records and medical reports. How to fully exploit the information contained in these data has long been a subject of research. Named entity recognition (NER) is the basis of text mining, and it has particular difficulties in the medical field, where scarce text resources and a large number of specialized domain terms remain significant challenges.
Methods: We propose an improved convolutional neural network model (imConvNet) to obtain additional text features, while retaining the classical BERT pre-trained model and BiLSTM model for named entity recognition. The imConvNet model extracts additional word vector features and improves recognition accuracy. The proposed model, named BERT-imConvNet-BiLSTM-CRF, consists of four layers: a BERT embedding layer that produces word embedding vectors; an imConvNet layer that captures the context features of each character; a BiLSTM (Bidirectional Long Short-Term Memory) layer that captures long-distance dependencies; and a CRF (Conditional Random Field) layer that labels characters based on their features and transition rules.
Results: The average F1 score on the public medical data set yidu-s4k reached 91.38% when imConvNet was combined with the classical model; on real electronic medical record text on impacted wisdom teeth, the model's F1 score was 93.89%. Both results are better than those of the classical models.
Conclusions: The proposed model (imConvNet) significantly improves the recognition accuracy of Chinese medical named entities and is applicable to a variety of medical corpora.
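The final CRF layer in the abstract labels characters from per-character features plus transition rules between adjacent labels. The following is a minimal sketch of that decoding idea only, using standard Viterbi search in plain Python. It is not the authors' implementation: the tag set (with a hypothetical DISEASE entity type), the emission scores, and the transition scores are all made-up assumptions for illustration, standing in for what the upstream BERT, imConvNet, and BiLSTM layers and a trained CRF would supply.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set; "DISEASE" is an illustrative entity type.
TAGS = ["O", "B-DISEASE", "I-DISEASE"]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the score of `tag` for character t.
    transitions[(prev, cur)] is the score of moving from prev to cur;
    pairs that are not listed are treated as 0.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximises path score + transition.
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-character input (assumption, not real data).
emissions = [
    {"O": 1.0, "B-DISEASE": 0.8, "I-DISEASE": 0.2},
    {"O": 0.1, "B-DISEASE": 0.3, "I-DISEASE": 1.5},
    {"O": 1.2, "B-DISEASE": 0.1, "I-DISEASE": 0.3},
]
# One transition rule: an I- tag may not directly follow O.
transitions = {("O", "I-DISEASE"): -10.0}

decoded = viterbi_decode(emissions, transitions, TAGS)
print(decoded)
```

Picking the top-scoring tag for each character independently would give O, I-DISEASE, O here, which breaks the BIO scheme. The transition penalty steers the decoder to a sequence that opens the entity with a B- tag instead.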
Pages: 12