Prediction of adverse drug reactions using demographic and non-clinical drug characteristics in FAERS data (vol 14, 23636, 2024)

Times cited: 0
Authors
Farnoush, Alireza [1]
Sedighi-Maman, Zahra [1]
Rasoolian, Behnam [3]
Heath, Jonathan J. [2, 3]
Fallah, Banafsheh [3]
Affiliations
[1] Adelphi Univ, Robert B Willumstad Sch Business, Garden City, NY 11756 USA
[2] St Bonaventure Univ, Sch Business, St Bonaventure, NY 14778 USA
[3] Georgetown Univ, McDonough Sch Business, Washington, DC 20057 USA
Source
SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01
Keywords
Adverse drug reactions; Deep learning; Demographic features; Machine learning; Non-clinical features; Random forest
DOI
10.1038/s41598-024-78282-w
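Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract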
The presence of adverse drug reactions (ADRs) is an ongoing public health concern. Because traditional methods for discovering ADRs are costly and limited, it is prudent to predict ADRs through non-invasive methods, such as machine learning applied to existing data. Although various studies address ADR prediction using non-clinical data, a process that leverages both demographic and non-clinical data for ADR prediction is missing. In addition, the importance of individual features in ADR prediction has yet to be fully explored. This study aims to develop an ADR prediction model based on demographic and non-clinical data and to identify the highest contributing factors. We focus on 30 common and severe ADRs reported to the Food and Drug Administration (FDA) between 2012 and 2023. We developed random forest (RF) and deep learning (DL) models that ingest demographic data (e.g., age and gender of patients) and non-clinical data, which include chemical, molecular, and biological drug characteristics. We unified the demographic and non-clinical data sources into a single dataset for ADR prediction. Model performance was assessed via the area under the receiver operating characteristic curve (AUC) and the mean average precision (MAP). We demonstrated that our parsimonious models, which include only the top 20 most important features (5 demographic features and 15 non-clinical features, of which 13 are molecular and 2 are biological), achieve ADR prediction performance comparable to a less practical, feature-rich model consisting of all 2,315 features. Specifically, our models achieved an AUC of 0.611 and 0.674 for the RF and DL algorithms, respectively. We hope our research provides researchers and clinicians with valuable insights and facilitates future research designs by identifying top ADR predictors (including demographic information) and practical parsimonious models.
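The abstract names two evaluation metrics, AUC and MAP, without defining them. The following is a minimal sketch of how they are commonly computed, not the authors' implementation. It assumes that MAP is the unweighted mean of per-ADR average precision, which the abstract does not confirm. The function names, the ADR labels, and the toy scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@k over the ranks k where a positive appears.

    Items are ranked by descending score; tied scores keep input order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / k
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return precision_sum / hits


def mean_average_precision(
    labels_by_adr: Dict[str, Sequence[int]],
    scores_by_adr: Dict[str, Sequence[float]],
) -> float:
    """Assumed definition: unweighted mean of average precision across ADRs."""
    aps = [
        average_precision(labels_by_adr[adr], scores_by_adr[adr])
        for adr in labels_by_adr
    ]
    return sum(aps) / len(aps)


# Hypothetical toy data: two ADR labels, four reports each.
toy_labels = {"nausea": [1, 0, 1, 0], "rash": [0, 1, 1, 0]}
toy_scores = {"nausea": [0.9, 0.8, 0.7, 0.1], "rash": [0.2, 0.6, 0.9, 0.4]}

toy_auc = roc_auc(toy_labels["nausea"], toy_scores["nausea"])
toy_map = mean_average_precision(toy_labels, toy_scores)
```

The pairwise AUC loop is quadratic in the number of reports, which is fine for illustration. On data at the scale of FAERS, a rank-based computation or an established library routine would be the more practical choice.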
Pages: 1