Multimodal model with text and drug embeddings for adverse drug reaction classification

Cited by: 13
Authors
Sakhovskiy, Andrey [1, 2]
Tutubalina, Elena [1, 3, 4]
Affiliations
[1] Kazan Fed Univ, 18 Kremlyovskaya St, Kazan 420008, Russia
[2] Lomonosov Moscow State Univ, 1 Leninskie gory, Moscow 119991, Russia
[3] Sber AI, 19 Vavilova St, Moscow 117997, Russia
[4] Natl Res Univ, Higher Sch Econ, 11 Pokrovsky Bulvar, Moscow 109028, Russia
Funding
Russian Science Foundation;
Keywords
Natural language processing; Social media; Adverse drug reactions;
DOI
10.1016/j.jbi.2022.104182
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
In this paper, we focus on the classification of tweets as sources of potential signals for adverse drug effects (ADEs) or drug reactions (ADRs). Following the intuition that text and drug structure representations are complementary, we introduce a multimodal model with two components. These components are state-of-the-art BERT-based models for language understanding and molecular property prediction. Experiments were carried out on multilingual benchmarks of the Social Media Mining for Health Research and Applications (#SMM4H) initiative. Our models obtained state-of-the-art results of 0.61 F1-measure and 0.57 F1-measure on #SMM4H 2021 Shared Tasks 1a and 2 in English and Russian, respectively. On the classification of French tweets from SMM4H 2020 Task 1, our approach pushes the state of the art by an absolute gain of 8% F1. Our experiments show that the molecular information obtained from neural networks is more beneficial for ADE classification than traditional molecular descriptors. The source code for our models is freely available at https://github.com/Andoree/smm4h_2021_classification.
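The two-component idea in the abstract can be illustrated with a minimal sketch. It assumes that the text and drug representations are fused by simple concatenation and scored by a single logistic layer; the abstract does not state the fusion method, so this is an illustrative assumption and not the authors' architecture. The function names (`fuse`, `predict_proba`, `f1_score`) and the toy vectors are hypothetical stand-ins for real BERT-based embeddings. The F1 helper shows the metric the reported scores refer to.

```python
import math
from typing import Sequence, List


def fuse(text_vec: Sequence[float], drug_vec: Sequence[float]) -> List[float]:
    """Concatenate a text embedding and a drug embedding (assumed fusion)."""
    return list(text_vec) + list(drug_vec)


def predict_proba(fused: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score over the fused vector: probability that a tweet mentions an ADE."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused vector length")
    score = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for a tweet embedding and a molecular embedding (hypothetical values).
    text_vec = [0.2, -0.1, 0.4]
    drug_vec = [0.7, 0.3]
    fused = fuse(text_vec, drug_vec)
    prob = predict_proba(fused, weights=[0.5, 0.5, 0.5, 0.5, 0.5], bias=0.0)
    print(len(fused), round(prob, 3))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a real system the two input vectors would come from pretrained encoders, and the classifier weights would be learned jointly on labeled tweets.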
Pages: 10