A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter

Cited by: 3
Authors
Jiang, Keyuan [1 ]
Chen, Tingyu [1 ]
Huang, Liyuan [1 ]
Calix, Ricardo A. [1 ]
Bernard, Gordon R. [2 ]
Affiliations
[1] Purdue Univ Northwest, Dept Comp Informat Technol & Graph, Hammond, IN USA
[2] Vanderbilt Univ, Dept Med, Nashville, TN USA
Source
BUILDING CONTINENTS OF KNOWLEDGE IN OCEANS OF DATA: THE FUTURE OF CO-CREATED EHEALTH | 2018, Vol. 247
Funding
US National Institutes of Health
Keywords
Information retrieval; Pharmacovigilance; Postmarketing surveillance; Relational similarity; Twitter; Distributed word representation; Misspellings
DOI
10.3233/978-1-61499-852-5-136
Chinese Library Classification (CLC) number
R-058
Abstract
Twitter, a microblogging social media platform, has seen growing use of its data for pharmacovigilance, the monitoring and promotion of the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. Medication names are known to be misspelled on social media, and finding the misspellings is challenging because there is no a priori knowledge of how people will misspell a given name. We developed a data-driven, relational similarity-based approach to discover misspellings of medication names. The approach assumes that a medicine has an identical (or similar) association with its effects whether its name is spelled correctly or misspelled. Using distributed representations of the words in tweets posted over the most recent 24 months, we discovered a total of 54 misspellings of 6 medicines whose indications include headache. Our search results also show that Twitter posts containing misspellings of codeine and ibuprofen can account for more than 10% of all tweets associated with each of these medicines. Compared with a phonetics-based approach, our method discovered more of the misspellings actually used on Twitter.
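The abstract does not give implementation details, so the following is only a minimal sketch of the stated idea: a misspelled name should relate to an effect word (here "headache") in roughly the same way as the correctly spelled name. The toy vectors, the function names, the thresholds, and the extra string-similarity filter are all assumptions made for illustration and are not taken from the paper; real vectors would come from a word-embedding model trained on tweets.

```python
import math
from difflib import SequenceMatcher

# Toy 3-dimensional vectors, invented for illustration only.
VECTORS = {
    "ibuprofen": [0.90, 0.10, 0.30],
    "ibuprophen": [0.88, 0.12, 0.31],
    "ibprofen": [0.91, 0.09, 0.28],
    "aspirin": [0.85, 0.15, 0.35],
    "headache": [0.20, 0.80, 0.10],
    "pizza": [0.10, 0.20, 0.90],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def offset(word, effect):
    """Vector difference standing in for the word-to-effect relation."""
    return [a - b for a, b in zip(VECTORS[word], VECTORS[effect])]


def candidate_misspellings(drug, effect, min_relation=0.95, min_spelling=0.7):
    """Return words whose relation to `effect` resembles that of `drug`.

    The spelling filter (an assumption of this sketch) drops other
    medicines that treat the same effect but are spelled differently.
    """
    target = offset(drug, effect)
    hits = []
    for word in VECTORS:
        if word in (drug, effect):
            continue
        relation = cosine(offset(word, effect), target)
        spelling = SequenceMatcher(None, word, drug).ratio()
        if relation >= min_relation and spelling >= min_spelling:
            hits.append((word, relation))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


hits = candidate_misspellings("ibuprofen", "headache")
print([word for word, _ in hits])
```

With these toy vectors, "ibuprophen" and "ibprofen" pass both filters, "aspirin" passes the relational test but fails the spelling test, and "pizza" fails the relational test.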
Pages: 136-140
Page count: 5