Multimodal model with text and drug embeddings for adverse drug reaction classification

Cited by: 13
Authors
Sakhovskiy, Andrey [1, 2]
Tutubalina, Elena [1, 3, 4]
Affiliations
[1] Kazan Fed Univ, 18 Kremlyovskaya St, Kazan 420008, Russia
[2] Lomonosov Moscow State Univ, 1 Leninskie gory, Moscow 119991, Russia
[3] Sber AI, 19 Vavilova St, Moscow 117997, Russia
[4] Natl Res Univ, Higher Sch Econ, 11 Pokrovsky Bulvar, Moscow 109028, Russia
Funding
Russian Science Foundation
Keywords
Natural language processing; Social media; Adverse drug reactions
DOI
10.1016/j.jbi.2022.104182
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this paper, we focus on the classification of tweets as sources of potential signals for adverse drug effects (ADEs) or drug reactions (ADRs). Following the intuition that text and drug structure representations are complementary, we introduce a multimodal model with two components. These components are state-of-the-art BERT-based models for language understanding and molecular property prediction. Experiments were carried out on multilingual benchmarks of the Social Media Mining for Health Research and Applications (#SMM4H) initiative. Our models obtained state-of-the-art results of 0.61 F1-measure and 0.57 F1-measure on #SMM4H 2021 Shared Tasks 1a and 2 in English and Russian, respectively. On the classification of French tweets from SMM4H 2020 Task 1, our approach pushes the state of the art by an absolute gain of 8% F1. Our experiments show that the molecular information obtained from neural networks is more beneficial for ADE classification than traditional molecular descriptors. The source code for our models is freely available at https://github.com/Andoree/smm4h_2021_classification.
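The two-component idea in the abstract can be illustrated with a minimal sketch. It assumes the text and drug representations are fused by simple concatenation and passed to a logistic output layer; the abstract does not state the fusion method, so this is an illustrative assumption and not the authors' implementation. The vectors, weights, and the function names `fuse` and `predict_proba` are hypothetical toy stand-ins for the outputs of the BERT-based text and molecular encoders.

```python
import math
import random


def fuse(text_vec, drug_vec):
    """Concatenate the two modality vectors into one feature vector.

    Concatenation is an assumption made for this sketch; the abstract
    does not specify how the two components are combined.
    """
    return list(text_vec) + list(drug_vec)


def predict_proba(features, weights, bias):
    """Logistic output layer: sigmoid of a weighted sum of the features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy stand-ins for encoder outputs (hypothetical values; real embeddings
# would have hundreds of dimensions).
text_vec = [0.2, -0.1, 0.4]  # hypothetical tweet embedding
drug_vec = [0.5, 0.3]        # hypothetical drug-structure embedding

rng = random.Random(0)
features = fuse(text_vec, drug_vec)
weights = [rng.uniform(-1, 1) for _ in features]  # untrained, illustrative only
prob = predict_proba(features, weights, bias=0.0)

print(f"fused dimension: {len(features)}, P(ADE) = {prob:.3f}")
```

In a trained model the weights would be learned jointly with, or on top of, the two encoders; here they are random and serve only to show the data flow.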
Pages: 10