Large-scale automatic extraction of side effects associated with targeted anticancer drugs from full-text oncological articles

Cited by: 20
Authors
Xu, Rong [1]
Wang, QuanQiu [2]
Affiliations
[1] Case Western Reserve Univ, Med Informat Program, Ctr Clin Invest, Cleveland, OH 44106 USA
[2] ThinTek LLC, Palo Alto, CA 94306 USA
Funding
US National Institutes of Health (NIH);
Keywords
Text mining; Information extraction; Targeted anticancer drugs; Drug side effects; Drug discovery; Drug repositioning; Drug toxicity prediction; RENAL-CELL CARCINOMA; KNOWLEDGE-BASE; BIOMEDICAL LITERATURE; CANCER; CARDIOTOXICITY; THERAPEUTICS; THERAPIES; TOXICITY; PATHWAYS; EVENTS;
DOI
10.1016/j.jbi.2015.03.009
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Targeted anticancer drugs such as imatinib, trastuzumab, and erlotinib have dramatically improved treatment outcomes in cancer patients; however, these innovative agents are often associated with unexpected side effects. The pathophysiological mechanisms underlying these side effects are not well understood. A comprehensive knowledge base of side effects associated with targeted anticancer drugs has the potential to illuminate the complex pathways underlying the toxicities induced by these drugs. While side effect association knowledge for targeted drugs exists in multiple heterogeneous data sources, published full-text oncological articles represent an important source of pivotal, investigational, and even failed trials in a variety of patient populations. In this study, we present an automatic process to extract targeted anticancer drug-associated side effects (drug-SE pairs) from a large number of high-profile full-text oncological articles. We downloaded 13,855 full-text articles from the Journal of Clinical Oncology (JCO) published between 1983 and 2013. We developed text classification, relationship extraction, signal filtering, and signal prioritization algorithms to extract drug-SE pairs from the downloaded articles. We extracted a total of 26,264 drug-SE pairs with an average precision of 0.405, a recall of 0.899, and an F1 score of 0.465. We show that side effect knowledge from JCO articles is largely complementary to that from US Food and Drug Administration (FDA) drug labels. Through integrative correlation analysis, we show that targeted drug-associated side effects positively correlate with their gene targets and disease indications. In conclusion, this unique database, built from a large number of high-profile oncological articles, could facilitate the development of computational models to understand the toxic effects associated with targeted anticancer drugs. (C) 2015 Elsevier Inc. All rights reserved.
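Note on the reported metrics: the abstract gives an average precision of 0.405, a recall of 0.899, and an F1 score of 0.465, yet the harmonic mean of 0.405 and 0.899 is about 0.558. One possible reading, which is an assumption and is not stated in the abstract, is that F1 was computed per drug (or per evaluation unit) and then averaged. The sketch below shows that an averaged F1 need not equal the F1 of the averaged precision and recall. The `per_drug` values are hypothetical and exist only to illustrate the arithmetic.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averages as reported in the abstract.
reported_precision, reported_recall, reported_f1 = 0.405, 0.899, 0.465

# F1 computed directly from the two reported averages.
f1_of_averages = f1(reported_precision, reported_recall)

# Hypothetical per-drug (precision, recall) pairs, invented for illustration.
per_drug = [(0.10, 1.00), (0.90, 0.80)]

mean_precision = sum(p for p, _ in per_drug) / len(per_drug)
mean_recall = sum(r for _, r in per_drug) / len(per_drug)

# Average of the per-drug F1 scores (macro-averaged F1).
mean_f1 = sum(f1(p, r) for p, r in per_drug) / len(per_drug)

print(f"F1 of reported averages: {f1_of_averages:.3f} (reported F1: {reported_f1})")
print(f"Toy data: mean F1 = {mean_f1:.3f}, F1 of means = {f1(mean_precision, mean_recall):.3f}")
```

In the toy data the averaged F1 is lower than the F1 of the averaged precision and recall, which is the same direction as the gap between 0.465 and 0.558 in the abstract. This does not confirm how the authors computed their score; it only shows that the three reported numbers can be mutually consistent.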
Pages: 64-72
Number of pages: 9
Related papers
8 in total
  • [1] Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles
    Xu, Rong
    Wang, QuanQiu
    JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 53 : 128 - 135
  • [2] Automatic construction of a large-scale and accurate drug-side-effect association knowledge base from biomedical literature
    Xu, Rong
    Wang, QuanQiu
    JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 51 : 191 - 199
  • [3] Temporal knowledge extraction from large-scale text corpus
    Yu Liu
    Wen Hua
    Xiaofang Zhou
    World Wide Web, 2021, 24 : 135 - 156
  • [4] Temporal knowledge extraction from large-scale text corpus
    Liu, Yu
    Hua, Wen
    Zhou, Xiaofang
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2021, 24 (01): : 135 - 156
  • [5] Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research
    Àlex Bravo
    Janet Piñero
    Núria Queralt-Rosinach
    Michael Rautschka
    Laura I Furlong
    BMC Bioinformatics, 16
  • [6] Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research
    Bravo, Alex
    Pinero, Janet
    Queralt-Rosinach, Nuria
    Rautschka, Michael
    Furlong, Laura I.
    BMC BIOINFORMATICS, 2015, 16
  • [7] Comparing a knowledge-driven approach to a supervised machine learning approach in large-scale extraction of drug-side effect relationships from free-text biomedical literature
    Rong Xu
    QuanQiu Wang
    BMC Bioinformatics, 16
  • [8] Automatic signal extraction, prioritizing and filtering approaches in detecting post-marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS)
    Xu, Rong
    Wang, QuanQiu
    JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 47 : 171 - 177