Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks

Cited by: 13
Authors
Zhang, Canlin [1]
Bis, Daniel [2]
Liu, Xiuwen [2]
He, Zhe [3]
Affiliations
[1] Florida State Univ, Dept Math, Tallahassee, FL 32306 USA
[2] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA
[3] Florida State Univ, Sch Informat, Tallahassee, FL 32306 USA
Keywords
Word sense disambiguation; LSTM; Self-attention; Biomedical;
DOI
10.1186/s12859-019-3079-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: In recent years, deep learning methods have achieved state-of-the-art performance on many natural language processing tasks. In the biomedical domain, however, they have not outperformed supervised word sense disambiguation (WSD) methods based on support vector machines or random forests, possibly because of inherent similarities among medical word senses.
Results: We propose two deep-learning-based models for supervised WSD: a model based on a bidirectional long short-term memory (BiLSTM) network, and an attention model based on the self-attention architecture. Our results show that the BiLSTM model with a suitable upper-layer structure outperforms the existing state-of-the-art models on the MSH WSD dataset, while our attention model is 3 to 4 times faster than our BiLSTM model and still achieves good accuracy. In addition, we trained "universal" models that disambiguate all ambiguous words together: in these models, the embedding of the target ambiguous word is concatenated to the max-pooled vector, acting as a "hint". Our universal BiLSTM model yielded about 90 percent accuracy.
Conclusion: Deep contextual models based on sequential information processing methods are able to capture the relative contextual information from pre-trained input word embeddings and thereby provide state-of-the-art results for supervised biomedical WSD tasks.
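The "hint" step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy numbers, and the order of concatenation (pooled vector first, target embedding second) are assumptions made for illustration. In a real model, the hidden states would come from the BiLSTM or self-attention encoder, and the resulting vector would feed a classifier over word senses.

```python
from typing import List

Vector = List[float]


def max_pool(hidden_states: List[Vector]) -> Vector:
    """Element-wise maximum over the sequence (time) dimension."""
    if not hidden_states:
        raise ValueError("need at least one hidden state")
    # zip(*...) groups the values of each dimension across all time steps.
    return [max(dim_values) for dim_values in zip(*hidden_states)]


def add_target_hint(hidden_states: List[Vector], target_embedding: Vector) -> Vector:
    """Concatenate the target word's embedding to the max-pooled context vector.

    Assumption: the pooled vector comes first; the abstract does not state the order.
    """
    return max_pool(hidden_states) + list(target_embedding)


# Hypothetical toy values: three time steps with 3-dimensional encoder outputs,
# and a 2-dimensional embedding for the ambiguous target word.
hidden_states = [
    [0.1, -0.5, 0.3],
    [0.7, 0.2, -0.1],
    [0.0, 0.4, 0.9],
]
target_embedding = [1.0, 0.0]

features = add_target_hint(hidden_states, target_embedding)
print(features)  # [0.7, 0.4, 0.9, 1.0, 0.0]
```

Because the target embedding is part of the classifier input, a single set of weights can in principle serve every ambiguous word, which is what makes the model "universal" in the abstract's sense.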
Pages: 15
Related papers
50 in total
  • [1] Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks
    Canlin Zhang
    Daniel Biś
    Xiuwen Liu
    Zhe He
    BMC Bioinformatics, 20
  • [2] Layered Multistep Bidirectional Long Short-Term Memory Networks for Biomedical Word Sense Disambiguation
    Bis, Daniel
    Zhang, Canlin
    Liu, Xiuwen
    He, Zhe
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 313 - 320
  • [3] Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation
    Yepes, Antonio Jimeno
    JOURNAL OF BIOMEDICAL INFORMATICS, 2017, 73 : 137 - 147
  • [4] Effective Attention-based Neural Architectures for Sentence Compression with Bidirectional Long Short-Term Memory
    Nhi-Thao Tran
    Viet-Thang Luong
    Ngan Luu-Thuy Nguyen
    Minh-Quoc Nghiem
    PROCEEDINGS OF THE SEVENTH SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY (SOICT 2016), 2016, : 123 - 130
  • [5] Biomedical Word Sense Disambiguation Based on Graph Attention Networks
    Zhang, Chun-Xiang
    Wang, Ming-Lei
    Gao, Xue-Yao
    IEEE ACCESS, 2022, 10 : 123328 - 123336
  • [6] Attention-based bidirectional-long short-term memory for abnormal human activity detection
    Kumar, Manoj
    Patel, Anoop Kumar
    Biswas, Mantosh
    Shitharth, S.
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [7] Attention-Based Joint Learning for Intent Detection and Slot Filling Using Bidirectional Long Short-Term Memory and Convolutional Neural Networks (AJLISBC)
    Muhammad, Yusuf Idris
    Salim, Naomie
    Huspi, Sharin Hazlin
    Zainal, Anazida
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (08) : 915 - 922
  • [8] Short-Term Traffic Congestion Forecasting Using Attention-Based Long Short-Term Memory Recurrent Neural Network
    Zhang, Tianlin
    Liu, Ying
    Cui, Zhenyu
    Leng, Jiaxu
    Xie, Weihong
    Zhang, Liang
    COMPUTATIONAL SCIENCE - ICCS 2019, PT III, 2019, 11538 : 304 - 314
  • [9] Speech emotion recognition based on convolutional neural network with attention-based bidirectional long short-term memory network and multi-task learning
    Liu, Zhen-Tao
    Han, Meng-Ting
    Wu, Bao-Han
    Rehman, Abdul
    APPLIED ACOUSTICS, 2023, 202
  • [10] Chinese word sense disambiguation based on neural networks
    刘挺
    卢志茂
    郎君
    李生
    Journal of Harbin Institute of Technology, 2005, (04) : 408 - 414