Large-scale automatic extraction of side effects associated with targeted anticancer drugs from full-text oncological articles

Cited by: 20

Authors:

Xu, Rong [1]

Wang, QuanQiu [2]

Affiliations:

[1] Case Western Reserve Univ, Med Informat Program, Ctr Clin Invest, Cleveland, OH 44106 USA

[2] ThinTek LLC, Palo Alto, CA 94306 USA

Source:

JOURNAL OF BIOMEDICAL INFORMATICS | 2015 / Volume 55

Funding:

US National Institutes of Health (NIH);

Keywords:

Text mining; Information extraction; Targeted anticancer drugs; Drug side effects; Drug discovery; Drug repositioning; Drug toxicity prediction; RENAL-CELL CARCINOMA; KNOWLEDGE-BASE; BIOMEDICAL LITERATURE; CANCER; CARDIOTOXICITY; THERAPEUTICS; THERAPIES; TOXICITY; PATHWAYS; EVENTS;

DOI:

10.1016/j.jbi.2015.03.009

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Targeted anticancer drugs such as imatinib, trastuzumab and erlotinib have dramatically improved treatment outcomes in cancer patients; however, these innovative agents are often associated with unexpected side effects. The pathophysiological mechanisms underlying these side effects are not well understood. A comprehensive knowledge base of side effects associated with targeted anticancer drugs has the potential to illuminate the complex pathways underlying toxicities induced by these drugs. While side effect association knowledge for targeted drugs exists in multiple heterogeneous data sources, published full-text oncological articles represent an important source of pivotal, investigational, and even failed trials in a variety of patient populations. In this study, we present an automatic process to extract targeted anticancer drug-associated side effects (drug-SE pairs) from a large number of high-profile full-text oncological articles. We downloaded 13,855 full-text articles from the Journal of Clinical Oncology (JCO) published between 1983 and 2013. We developed text classification, relationship extraction, signal filtering, and signal prioritization algorithms to extract drug-SE pairs from the downloaded articles. We extracted a total of 26,264 drug-SE pairs with an average precision of 0.405, a recall of 0.899, and an F1 score of 0.465. We show that side effect knowledge from JCO articles is largely complementary to that from the US Food and Drug Administration (FDA) drug labels. Through integrative correlation analysis, we show that targeted drug-associated side effects positively correlate with their gene targets and disease indications. In conclusion, this unique database that we built from a large number of high-profile oncological articles could facilitate the development of computational models to understand toxic effects associated with targeted anticancer drugs. © 2015 Elsevier Inc. All rights reserved.


Pages: 64-72

Number of pages: 9

8 items in total

[1] Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles
Xu, Rong
Wang, QuanQiu
JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 53 : 128 - 135
[2] Automatic construction of a large-scale and accurate drug-side-effect association knowledge base from biomedical literature
Xu, Rong
Wang, QuanQiu
JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 51 : 191 - 199
[3] Temporal knowledge extraction from large-scale text corpus
Yu Liu
Wen Hua
Xiaofang Zhou
World Wide Web, 2021, 24 : 135 - 156
[4] Temporal knowledge extraction from large-scale text corpus
Liu, Yu
Hua, Wen
Zhou, Xiaofang
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2021, 24 (01) : 135 - 156
[5] Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research
Àlex Bravo
Janet Piñero
Núria Queralt-Rosinach
Michael Rautschka
Laura I Furlong
BMC Bioinformatics, 16
[6] Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research
Bravo, Alex
Pinero, Janet
Queralt-Rosinach, Nuria
Rautschka, Michael
Furlong, Laura I.
BMC BIOINFORMATICS, 2015, 16
[7] Comparing a knowledge-driven approach to a supervised machine learning approach in large-scale extraction of drug-side effect relationships from free-text biomedical literature
Rong Xu
QuanQiu Wang
BMC Bioinformatics, 16
[8] Automatic signal extraction, prioritizing and filtering approaches in detecting post-marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS)
Xu, Rong
Wang, QuanQiu
JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 47 : 171 - 177
