Machine learning-based prediction of drug approvals using molecular, physicochemical, clinical trial, and patent-related features

Cited by: 1
Authors
Ciray, Fulya [1, 2]
Dogan, Tunca [1, 3, 4]
Affiliations
[1] Hacettepe Univ, Dept Comp Engn, Biol Data Sci Lab, Ankara, Turkey
[2] METU, Grad Sch Informat, Dept Hlth Informat, Ankara, Turkey
[3] Hacettepe Univ, Inst Informat, Dept Hlth Informat, Ankara, Turkey
[4] Hacettepe Univ, Grad Sch Hlth Sci, Dept Bioinformat, Ankara, Turkey
Keywords
Approval of drugs; clinical trials; drug patents; machine learning; molecular structures; physicochemical properties; predictive modeling; REGULATORY APPROVAL; MISSING DATA; DISCOVERY; DATABASE
DOI
10.1080/17460441.2023.2153830
Chinese Library Classification number
R9 [Pharmacy];
Discipline classification code
1007;
Abstract
Background: Drug development productivity has been declining lately due to elevated costs and reduced discovery rates. Therefore, pharmaceutical companies have been seeking alternative ways to determine and evaluate drug candidates. Research design and methods: In this work, we proposed a new computational approach to directly predict the regulatory approval of drug candidates, and implemented it as a method called "DrugApp." To accomplish this task, we employed multiple types of features including molecular and physicochemical properties of drug candidates, together with clinical trial and patent-related features, which are then processed by random forest classifiers to train our disease group-specific approval prediction models. Results: Our evaluations indicated DrugApp has a high and robust prediction performance. Within a use-case study, we showed our method can predict phase IV trial drugs that are later withdrawn from the market due to severe side effects. Finally, we used DrugApp models to forecast the approval of drug candidates that are currently in phases I/II/III of clinical trials. Conclusions: We hope that our study will aid the research community in terms of evaluating and improving the process of drug development. The datasets, source code, results, and pre-trained models of DrugApp are freely available at https://github.com/HUBioDataLab/DrugApp.
Pages: 1425-1441
Number of pages: 17